Abstract: Compressed sensing is by now well-established as
an effective tool for extracting sparsely distributed information, 
where sparsity is a discrete concept, referring to the number of 
dominant nonzero signal components in some basis for the signal 
space. In this paper, we establish a framework for estimation 
of continuous-valued parameters based on compressive measure- 
ments on a signal corrupted by additive white Gaussian noise 
(AWGN). While standard compressed sensing based on naive 
discretization has been shown to suffer from performance loss 
due to basis mismatch, we demonstrate that this is not an inherent 
property of compressive measurements. Our contributions are 
summarized as follows: (a) We identify the isometries required 
to preserve fundamental estimation-theoretic quantities such 
as the Ziv-Zakai bound (ZZB) and the Cramer-Rao bound 
(CRB). Under such isometries, compressive projections can be 
interpreted simply as a reduction in "effective SNR." (b) We 
show that the convergence of the ZZB to the CRB provides a 
criterion for determining the minimum number of measurements
for "accurate" parameter estimation. (c) We provide detailed
computations of the number of measurements needed for the 
isometries in (a) to hold for the problem of frequency estimation 
in a mixture of sinusoids. We show via simulations that the design 
criterion in (b) is accurate for estimating the frequency of a single 
sinusoid. 



I. Introduction 

Compressed sensing has proven remarkably successful in 
exploiting sparsity to extract information from signals with 
only a small number of measurements. The standard approach 
has two stages: first, take multiple random projections of the 
signal, with the number of projections growing linearly with 
the sparsity and only logarithmically with the dimensionality 
of the signal. Then, use one among a variety of recovery algorithms,
such as $\ell_1$ reconstruction or Orthogonal Matching Pursuit
(OMP), to estimate the signal from the random projections.
In this standard framework for compressed sensing, sparsity 
is an inherently discrete concept: the number of nonzero 
signal components in some basis has to be small compared 
to the dimension of the signal. In this paper, we investigate 
the effectiveness of compressive measurements in estimating 
continuous-valued parameters from signals that are corrupted
by AWGN, when the dimensionality of the parameter set is 
much smaller than the signal dimension. 

It is possible to apply standard compressed sensing to 
continuous-valued parameter estimation, but it does not per- 
form well. Consider the problem of estimating the frequencies 
in a mixture of sinusoids, which has many applications including
estimating angles of arrival (AoAs) at arrays and pitch
detection. Typically, the number of sinusoids is much smaller
than the number of samples and, therefore, the signal is sparse 
in the frequency domain. However, the results of conventional 
compressed sensing do not apply directly, since they require 
the signal to be sparse over a finite basis, whereas the fre- 
quencies could lie anywhere on a continuum. Straightforward 
application of compressed sensing recovery algorithms after 
discretizing the set of frequencies has been shown to result 
in error floors due to "basis mismatch" and the consequent 
spectral leakage [1]. This observation raises some fundamental 
questions. Do compressive measurements preserve all the information
needed for continuous-valued parameter estimation?
If so, under what conditions? How many measurements do we 
require to satisfy these conditions? In this paper, we establish 
a systematic framework that addresses these questions for 
parameter estimation based on signals corrupted by AWGN. 
Contributions: We first identify fundamental structural prop- 
erties for compressive estimation in AWGN, and then illustrate 
them by explicit computation for frequency estimation for a 
mixture of sinusoids. 

Isometries for estimation: Suppose we want to estimate a $K$-dimensional
parameter $\theta = (\theta_1, \theta_2, \ldots, \theta_K)$ from projections
of an $N$-dimensional signal $x(\theta)$ in AWGN. When we make
all N measurements, fundamental bounds on the estimation 
error variance, such as the Ziv-Zakai bound (ZZB) and the 
Cramer-Rao bound (CRB), relate the geometry of the signal 
manifold to the best achievable performance. From the ZZB, 
we can infer that "coarse" estimation depends on the pairwise 
distances $\|x(\theta) - x(\theta')\|$ $\forall \theta, \theta'$, while the CRB tells us that
"fine" estimation depends on norms of linear combinations
of the partial derivatives $\{\partial x/\partial\theta_k\}$ (vectors in the tangent
plane, which are the limit of differences as $\theta \to \theta'$). We
extend these observations to compressive estimation by replacing
the signal manifold $x(\theta)$ by $Ax(\theta)$, where $A$ is
the compressive measurement matrix containing the random
projection weights. This identifies the isometries required
to preserve the structure of the ZZB and CRB. We also
note that compressive measurements lead to an SNR penalty 
of M/N, where M is the number of random projections. 
This is because each random projection captures 1/N of the 
signal energy on average (normalizing such that the noise 
variance is unchanged). Thus, our structural observations can 
be roughly stated as follows: if we ensure that the geometry 
is roughly unaltered after making random projections, then 
the performance with compressive measurements is as with 
all $N$ measurements, except for an SNR penalty of $M/N$.
Specifically, we show that if the measurement matrix $A$
satisfies the pairwise isometry property (PIP)
($\|Ax(\theta) - Ax(\theta')\| \approx \sqrt{M/N}\,\|x(\theta) - x(\theta')\|$),
the ZZB with compressive measurements is approximately equal to the ZZB
with all $N$ measurements, except for the SNR penalty of
$M/N$. We prove an analogous result for the CRB when $A$
guarantees tangent plane isometry
($\|A \sum_k a_k\, \partial x(\theta)/\partial\theta_k\| \approx \sqrt{M/N}\,\|\sum_k a_k\, \partial x(\theta)/\partial\theta_k\|$, $\forall a_k$),
which is a weaker requirement than pairwise isometry.

Number of measurements: When the preceding isometries 
hold, we can use their relationship to the ZZB/CRB to obtain a 
tight prediction on the number of measurements necessary for 
successful compressive estimation. It is known that nonlinear 
estimation problems exhibit a threshold behavior with the 
SNR and that the convergence of the ZZB to the CRB with 
increasing SNR can predict the threshold. We employ this 
observation to predict the number of measurements required 
to avoid performance floors, since the effective SNR with
compressive measurements increases linearly with the number 
of measurements. 

Computations for sinusoidal mixtures: While the preceding 
results reveal the structure of compressive estimation, compu- 
tation of the number of measurements required to achieve the 
desired isometries and to avoid performance floors requires 
a problem-specific analysis. To this end, we consider the 
fundamental problem of frequency estimation for a mixture 
of sinusoids. For estimating K frequencies from N samples, 
we show that: (a) $O(K \log NK\delta^{-1})$ measurements suffice
to provide tangent plane isometries, where $\delta$ depends on
the frequency separation, (b) $O(K \log NK\delta^{-1})$ measurements
suffice to provide pairwise isometries between two sets of
frequencies $\omega = (\omega_1, \omega_2, \ldots, \omega_K)$ and $\omega' = (\omega'_1, \omega'_2, \ldots, \omega'_K)$
that are "well-separated" ($\delta$ lower bounds the smallest singular
value of an $N \times 2K$ matrix whose columns are the $2K$
sinusoids). We strengthen these results for a single sinusoid
($K = 1$), exploiting the continuity of the sinusoidal manifold
to show that $O(\log N)$ measurements suffice to guarantee
pairwise isometry between sinusoids at any two frequencies
$\omega, \omega'$ (eliminating the "well-separated" requirement). We also
show that the criterion for prediction of the number of mea- 
surements, based on the convergence of ZZB to CRB, is 
tight, by evaluating the performance of an algorithm which 
closely approximates the MAP estimator. The algorithm works 
in two stages: first, from a discrete set of frequencies, we 
pick the one that fits the observations best and, then, we 
perform local refinements using Newton's method. When the 
number of measurements is slightly smaller than the predicted 
number, we observe an estimation error floor. This error floor 
vanishes when we make the predicted number of compressive 
measurements. 
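
The two-stage procedure just described (grid search followed by Newton refinement) can be illustrated with a minimal sketch for a single sinusoid with unknown phase. This is not the authors' implementation: the function names, the cost $2|s^H y| - \|s\|^2$ with $s = Ax(\omega)$ (the ML cost after maximizing over the phase), the finite-difference derivatives and the grid size are all assumptions made for illustration.

```python
import cmath
import math
import random

def sinusoid(omega, n):
    """x(omega): unit-amplitude sinusoid with time indices centred at zero."""
    return [cmath.exp(1j * omega * (k - (n - 1) / 2.0)) for k in range(n)]

def random_matrix(m, n, rng):
    """M x N matrix with i.i.d. entries from {+-1/sqrt(N), +-j/sqrt(N)}."""
    scale = 1.0 / math.sqrt(n)
    alphabet = (scale, -scale, 1j * scale, -1j * scale)
    return [[rng.choice(alphabet) for _ in range(n)] for _ in range(m)]

def project(matrix, vec):
    """Matrix-vector product A v."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def fit(omega, matrix, y):
    """Assumed cost: 2|s^H y| - ||s||^2 with s = A x(omega); larger is better."""
    s = project(matrix, sinusoid(omega, len(matrix[0])))
    corr = sum(si.conjugate() * yi for si, yi in zip(s, y))
    return 2.0 * abs(corr) - sum(abs(si) ** 2 for si in s)

def estimate_frequency(matrix, y, grid_size, newton_steps=5, delta=1e-4):
    # Stage 1: pick the best-fitting frequency on a uniform grid.
    grid = [2.0 * math.pi * g / grid_size for g in range(grid_size)]
    omega = max(grid, key=lambda w: fit(w, matrix, y))
    # Stage 2: Newton refinement with central finite differences.
    for _ in range(newton_steps):
        f_minus = fit(omega - delta, matrix, y)
        f_zero = fit(omega, matrix, y)
        f_plus = fit(omega + delta, matrix, y)
        grad = (f_plus - f_minus) / (2.0 * delta)
        hess = (f_plus - 2.0 * f_zero + f_minus) / delta ** 2
        if hess >= 0.0:  # cost not locally concave: stop refining
            break
        omega -= grad / hess
    return omega % (2.0 * math.pi)
```

In this sketch the grid only needs to be fine enough to land inside the main lobe of the cost surface; the Newton steps then remove the residual grid error, which is what avoids the discretization floor.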



II. Related work 

The goal of standard compressed sensing [2], [3] is to
recover signals which are sparse over a finite basis with signif- 
icantly fewer measurements than the dimension of the obser- 
vation space. Signal recovery requires that the measurement 
matrix must satisfy the Restricted Isometry Property (RIP): 
the distance between any two sparse signals must be roughly 
invariant under the action of the matrix. If the RIP is satisfied, 
sparse signals can be recovered efficiently using techniques 
such as Orthogonal Matching Pursuit (OMP) and $\ell_1$-norm
minimization. Reference [4] used the Johnson-Lindenstrauss
(JL) lemma to provide a simple proof that $O(K \log N)$ random
projections suffice to establish RIP for recovering $K$-sparse
vectors in $\mathbb{R}^N$. We briefly summarize the key ideas, since we
use an analogous approach in establishing pairwise isometry 
for the mixture of sinusoids example discussed in this paper. 
The JL lemma states that, to approximately preserve the 
pairwise distances between P points after random projections 
(with the weights chosen from appropriate distributions, such 
as i.i.d. $\pm 1$ [5]), we need $O(\log P)$ such projections. However,
to provide an RIP for compressive measurement matrices,
the distances between any two $K$-sparse vectors must be
preserved. Since the number of such vectors is infinite, the
JL lemma cannot be applied directly. However, the desired
RIP result is established in [4] by discretizing the set of $K$-sparse
vectors sufficiently finely, applying the JL lemma to the
resulting discrete set of points, and then exploiting continuity 
to provide isometries for the remaining points. 

For compressive estimation of continuous-valued parame- 
ters, sparsity corresponds to the dimension of the parameter 
space being significantly smaller than that of the observation 
space. This problem was perhaps first investigated in [6],
which identifies that the analogue of the RIP here is the 
pairwise isometry property considered in the present paper. 
However, it does not relate this property to estimation-theoretic 
bounds as done here. Reference [6] also shows that compressive
measurements guarantee pairwise $\epsilon$-isometry for a signal
manifold with probability $1 - \rho$, as long as the number of
measurements $M$ satisfies

\[ M = O\left(\epsilon^{-2} \log(1/\rho)\, K \log\left(N V R \tau^{-1} \epsilon^{-1}\right)\right), \tag{1} \]

where $V$, $R$, $\tau$ are properties of the signal manifold ($1/\tau$ is the
condition number which is a generalization of the radius of
curvature, $R$ is the geodesic covering regularity and $V$ is the
volume). However, to the best of our knowledge, it is difficult
to specify how $\{\tau, V, R\}$ scale with the parameters $N$ and $K$
in general. In this paper, therefore, we provide a self-contained 
derivation of the number of measurements required to preserve 
these isometries when the signal manifold consists of a mixture 
of sinusoids in Section VI. Compressive parameter estimation
has also been studied in [7]; however, since the noise model
there is adversarial, the results are pessimistic for many 
practical applications in which a Gaussian model for the noise 
is a good fit. 

Algorithms to estimate the frequencies in a mixture of 
sinusoids from compressive measurements are proposed and 
evaluated in [8], [9]. Both of these papers assume that the
sinusoids have a minimum frequency separation and that the 
frequencies come from an oversampled DFT grid. They pro- 
pose variants of standard compressed sensing algorithms, such 
as Orthogonal Matching Pursuit (OMP) and Iterative Hard 
Thresholding (IHT), which rely on the sinusoids' frequencies 
not being too close. As mentioned earlier, restricting the 
frequency estimation to a discrete grid in this fashion results in 
performance floors, as studied in great detail in [1]. However,
as shown in this paper and in our earlier conference papers
[10], [11], it is possible to avoid such performance floors, and
to attain the CRB, by local refinements based on Newton-like 
algorithms after grid-based coarse estimation. 



To the best of our knowledge, other than our conference 
paper [10], this is the first paper to relate isometry conditions
to estimation-theoretic bounds for compressive parameter esti- 
mation, and to show that, in the AWGN setting, the only effect 
of compressive measurements (assuming that the pairwise 
isometry condition is satisfied) is an SNR penalty of M/N. 
This SNR penalty has been observed in specific scenarios such 
as the detection problem considered in [12], but we believe
that the present paper is the first to establish it for the general 
problem of parameter estimation in AWGN. 

This paper goes beyond the results in our conference paper 
[10] in multiple ways. First, we establish a connection between
the pairwise isometry property and the Ziv-Zakai bound. 
We then show how the connections between the ZZB and 
CRB, together with the isometry conditions, can be used to 
predict the number of measurements required for accurate 
compressive estimation. We also characterize the number of 
measurements needed to provide isometry guarantees for a 
mixture of sinusoids, unlike [10], which only deals with a
single sinusoid. Finally, the isometry guarantees we provided
in [10] for a pair of sinusoids required their frequencies to be
"well-separated". Here, we close the gap and provide such an 
isometry for any pair of frequencies. 

In the algorithm description and numerical illustrations in 
this paper, we restrict attention to a single sinusoid in order to 
illustrate the fundamental features of compressive estimation. 
However, as described in detail in our conference papers 
[10], [11], our algorithmic approach (discrete grid followed by
Newton refinement) extends easily to estimate the frequencies 
of multiple sinusoids. While the latter is a canonical problem 
of fundamental interest, it is worth noting that an important 
application that motivates us is the problem of adapting 
very large antenna arrays [11], [13]. Compressive parameter
estimation in this context exploits the relatively small number 
of dominant multipath rays in order to estimate the spatial 
frequencies (and hence the angles of arrival) for these rays. 

Outline: We begin in Section III by stating the compressive
parameter estimation problem in AWGN and the isometry
properties needed for successful estimation. The relationship
between these isometry properties and the estimation error
bounds (CRB/ZZB) is brought out in Section IV. In Section V,
we consider the problem of estimating the frequency of
a sinusoid. We show how the convergence of the ZZB to the
CRB at high SNR can predict the number of compressive measurements
needed to avoid error floors. Section VI derives the
number of measurements needed to guarantee these isometry
conditions for the problem of frequency estimation from a
mixture of $K$ sinusoids, while Section VII focuses on the
number of measurements needed for the single sinusoid case
($K = 1$).



III. Compressive measurements 

Consider an $N$-dimensional signal $x \in \mathbb{C}^N$ that is
parametrized by a $K$-dimensional real-valued quantity
$\theta = [\theta_1 \cdots \theta_K]$ with a prior distribution $p(\theta)$. We investigate
the problem of compressively estimating $\theta$ from $M$ random
projections ($M \ll N$) of the signal manifold $x(\theta)$:

\[ y_i = w_i^T \left(x(\theta) + z_i\right), \quad 1 \le i \le M, \]

where the elements of each projection vector $w_i$ are drawn
i.i.d. from $\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$ with equal probability.
Furthermore, we assume that the measurement noise
$z_i \sim \mathcal{CN}(0, \sigma^2 I_N)$ and is independent across time. Vectorizing the
above equation, we have

\[ y = Ax(\theta) + z, \tag{2} \]

where $A = [w_1 \cdots w_M]^T$ is the measurement matrix and
$z = [w_1^T z_1 \cdots w_M^T z_M]^T$ is the noise. Since $\|w_i\|^2 = 1$
and the noise vectors $z_i$ are independent across time, we have
$z \sim \mathcal{CN}(0, \sigma^2 I_M)$.

If the matrix $A$ satisfies certain isometry conditions, then
we can successfully estimate $\theta$ from $M \ll N$ measurements.
We first explain why these conditions are helpful intuitively 
and then define them formally. 

The Maximum Likelihood (ML) estimator of $\theta$ for the
model in (2) is given by

\[ \hat{\theta} = \arg\min_{\theta'} \|y - Ax(\theta')\| = \arg\min_{\theta'} \|Ax(\theta) - Ax(\theta') + z\|. \]

If the number of measurements is too small and $A$ has a
large nullspace, it is possible that $\|A(x(\theta) - x(\theta'))\| \approx 0$
even when $\|x(\theta) - x(\theta')\|$ is large. Thus, with small amounts
of noise $z$, the optimizing parameter $\hat{\theta}$ could be drastically
different from the true parameter $\theta$, resulting in large errors.
This problem can be avoided if the matrix $A$ preserves
the geometry of the estimation problem by ensuring that
the distance between $x(\theta)$ and $x(\theta')$ remains approximately
unaltered under its action. Specifically, if we have

\[ \|A(x(\theta) - x(\theta'))\| \propto \|x(\theta) - x(\theta')\| \quad \forall \theta, \theta', \]

we see from (2) that the ML estimate at high SNR from
$M$ compressive measurements will roughly coincide with the
estimate we would have obtained if we had access to all
$N$ measurements of $x(\theta)$. The pairwise $\epsilon$-isometry property
captures this idea of distance preservation precisely.
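Pairwise $\epsilon$-isometry property: The matrix $A$ satisfies the
pairwise $\epsilon$-isometry property ($\epsilon < 1$) for the signal model
$x(\theta)$ if

\[ \sqrt{\frac{M}{N}}\,(1 - \epsilon) \le \frac{\|Ax(\theta_1) - Ax(\theta_2)\|}{\|x(\theta_1) - x(\theta_2)\|} \le \sqrt{\frac{M}{N}}\,(1 + \epsilon) \quad \forall \theta_1, \theta_2. \tag{3} \]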


We now explain the reason for the isometry constants
$\sqrt{M/N}(1 - \epsilon)$ and $\sqrt{M/N}(1 + \epsilon)$. Consider a single random
projection of a signal $v$ onto the weights $w_i$ that are
chosen i.i.d. from $\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$. The average energy
in the projection is $1/N$ of the energy in the signal $v$:
$\mathbb{E}|w_i^T v|^2 = (1/N)\|v\|^2$. Thus, $M$ compressive measurements
capture $M/N$ of the signal energy on the average:
$\mathbb{E}\|Av\|^2 = (M/N)\|v\|^2$. For large enough $M$, we can apply
the Law of Large Numbers (LLN) to conclude that, for a
particular realization of the measurement matrix, $\|Av\|^2$
approaches its expected value $(M/N)\|v\|^2$ with high probability.
Thus, for compressive measurements, it is natural to define the
pairwise isometry property with the constants $\sqrt{M/N}(1 - \epsilon)$
and $\sqrt{M/N}(1 + \epsilon)$.
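
The energy argument above can be checked with a small Monte Carlo sketch. The helper names and the choice of Gaussian test signals are assumptions for illustration; only the projection alphabet $\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$ comes from the text.

```python
import random

def random_projection_matrix(m, n, rng):
    """M x N matrix with i.i.d. entries from {+-1/sqrt(N), +-j/sqrt(N)}."""
    scale = n ** -0.5
    alphabet = [scale, -scale, 1j * scale, -1j * scale]
    return [[rng.choice(alphabet) for _ in range(n)] for _ in range(m)]

def energy(vec):
    """Squared Euclidean norm."""
    return sum(abs(c) ** 2 for c in vec)

def project(matrix, vec):
    """Matrix-vector product A v."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def captured_energy_fraction(m, n, trials, seed=0):
    """Average of ||A v||^2 / ||v||^2 over random matrices and signals."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        a = random_projection_matrix(m, n, rng)
        total += energy(project(a, v)) / energy(v)
    return total / trials
```

With, say, `captured_energy_fraction(16, 64, 200)`, the average should land close to $M/N = 0.25$, while individual realizations fluctuate around that value, which is why the definition allows the slack $\epsilon$.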

We note that a particular instance of a randomly generated 
measurement matrix need not satisfy the pairwise isometry 
property for the signal manifold x(0). However, when the 
number of measurements $M$ is sufficiently large, [6] shows
that the pairwise $\epsilon$-isometry property can be satisfied with
arbitrarily high probability. 

A weaker notion of distance preservation is the tangent
plane isometry property that is particularly useful when we
wish to refine an estimate $\hat{\theta}$ that is "close" to the true
parameter value $\theta$. In this case, since we are interested only
in the ML cost surface around the true parameter $\theta$, it suffices
to preserve the geometry of the estimation problem in the
vicinity of $\theta$ by ensuring that the distances between $x(\theta')$ and
$x(\theta)$ for $\theta' \to \theta$ are preserved under the action of $A$. This
is captured by the tangent plane isometry property defined as
follows.

Tangent plane $\epsilon$-isometry property: The matrix $A$ satisfies
the tangent plane $\epsilon$-isometry property ($\epsilon < 1$) for the signal
model $x(\theta)$ if

\[ \sqrt{\frac{M}{N}}\,(1 - \epsilon) \le \frac{\left\|A \sum_m a_m \left(\partial x(\theta)/\partial\theta_m\right)\right\|}{\left\|\sum_m a_m \left(\partial x(\theta)/\partial\theta_m\right)\right\|} \le \sqrt{\frac{M}{N}}\,(1 + \epsilon) \tag{4} \]

for all $a = (a_1, a_2, \ldots, a_K) \in \mathbb{R}^K \setminus \{0\}$ and parameters $\theta$.

By letting $\theta_2 \to \theta_1$ in the definition of the pairwise $\epsilon$-isometry
property, we see that a matrix $A$ which satisfies
the pairwise isometry property for the signal model x(0) 
also satisfies the tangent plane isometry, thereby confirming 
that tangent plane isometry is a weaker notion of distance 
preservation. 

In the next section, we show that these isometries suffice 
to preserve fundamental estimation-theoretic quantities such 
as the Cramer-Rao bound and the Ziv-Zakai bound, which
characterize the performance of estimators in very general 
settings. 

IV. Relating the isometries to estimation bounds 

Consider the general problem of estimating $\theta$ from $L$
measurements

\[ y = Bx(\theta) + z, \quad \theta \in \Theta, \tag{5} \]

where $B$ is any $L \times N$ complex-valued matrix and
$z \sim \mathcal{CN}(0, \sigma^2 I)$. The compressive estimation problem is subsumed
in this model (obtained by setting $B = A$, whose elements are
chosen i.i.d. from $\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$), as is the more conventional
problem of estimating $\theta$ from all $N$ measurements
(obtained by setting $B = I_N$, the $N \times N$ identity matrix). Note
that, in both these cases, the per-measurement SNR $\mathbb{E}|y_k|^2/\sigma^2$
is the same, since the rows of the compressive measurement
matrix $A$ and the identity matrix $I_N$ both have unit norm (in
the $\ell_2$ sense).

Theorems 1 and 2 connect fundamental estimation-theoretic
bounds to the isometries defined in the previous section. The
Ziv-Zakai Bound (ZZB) is known to be an accurate predictor
of best possible estimation performance over a wide range of
SNRs. Roughly speaking, it takes into account two sources
of error: coarse error, when the estimate is not close to the
true value of the parameter (essentially, making an error in
hypothesis testing after binning the parameter space); and
fine-grained error (the mean squared error from the true value when
the estimate is in the right bin). We show in this section how
the ZZB is related to the pairwise isometry property (Theorem
2). The ZZB depends on the matrix $B$ only through the set of
pairwise SNRs $\|Bx(\theta_1) - Bx(\theta_2)\|^2/\sigma^2$ $\forall \theta_1, \theta_2 \in \Theta$.
Therefore, when the compressive measurement matrix $A$ satisfies
the pairwise isometry property (3), the ZZB with compressive
measurements ($B = A$) is approximately the same as the ZZB
with all $N$ measurements, but at an SNR penalty of $M/N$
($B = \sqrt{M/N}\, I_N$).

At high SNR, the probability of the estimate falling into
the wrong bin becomes negligible, and the Cramer-Rao bound
(CRB), which characterizes only fine-grained error, provides
an excellent prediction of performance, while being easier to
compute than the ZZB (the ZZB converges to the CRB when
the SNR is high enough). We relate the CRB to the tangent
plane isometry property in this section (Theorem 1). The CRB
depends on the measurement matrix only through norms of
the vectors $B \sum_m a_m \left(\partial x(\theta)/\partial\theta_m\right)$. Thus, if $A$ satisfies the
tangent plane isometry (4), the CRB with $M$ compressive
measurements is approximately equal to the CRB with all $N$
measurements, but at an SNR that is lower by $M/N$.

While the connections established here between estimation- 
theoretic bounds and the corresponding isometries apply gen- 
erally to compressive estimation in AWGN, showing that these 
isometries indeed hold requires a problem-specific analysis, 
as we illustrate for sinusoidal mixtures in later sections. As
with standard compressed sensing, the goal of such analyses 
is to characterize the number of measurements required for 
such isometries to hold with high probability for random 
measurement matrices. 

A. Cramer-Rao Bound 

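Consider any estimator $\hat{\theta}(y)$ with an error covariance matrix
$R(\hat{\theta})$ for the model in (5). The Bayesian Cramer-Rao Lower
Bound (CRB) on the estimation error covariance is given by
$R(\hat{\theta}) \succeq F(B)^{-1}$, where $F(B)$ is the Fisher Information
Matrix (FIM) whose $(m, n)$th element is

\[ F_{m,n}(B) = \mathbb{E}_\theta\left\{ \frac{2}{\sigma^2}\, \Re\left\{ \left(B \frac{\partial x(\theta)}{\partial\theta_m}\right)^H \left(B \frac{\partial x(\theta)}{\partial\theta_n}\right) \right\} \right\} + \mathbb{E}_\theta\left\{ \frac{\partial \ln p(\theta)}{\partial\theta_m} \frac{\partial \ln p(\theta)}{\partial\theta_n} \right\}, \tag{6} \]

where $\Re\{a\}$ denotes the real part of the complex number $a$
and $\mathbb{E}_\theta$ denotes an expectation taken over the prior distribution
$p(\theta)$ of the parameters to be estimated. We now show that, if
$A$ satisfies the tangent plane isometry property, the FIM with
compressive measurements $F(A)$ is approximately equal to
the FIM with all $N$ measurements observed at an SNR lower
by a factor of $M/N$.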

Theorem 1. Let $A$ be an $M \times N$ measurement matrix
with elements chosen i.i.d. from Uniform$\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$.
Suppose that $A$ satisfies the tangent plane $\epsilon$-isometry property
(4) for the signal manifold $x(\theta)$. Then the Fisher Information
Matrix $F(A)$, with compressive measurements as in (2), is
related to the FIM with all $N$ measurements as follows:

\[ F\left(\sqrt{M/N}\,(1 - \epsilon)\, I_N\right) \preceq F(A) \preceq F\left(\sqrt{M/N}\,(1 + \epsilon)\, I_N\right). \tag{7} \]



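Proof: Consider the quadratic form $a^T F(A) a$ for any
$a = [a_1 \cdots a_K]^T \in \mathbb{R}^K$. We have

\[ a^T F(A) a = \frac{2}{\sigma^2}\, \mathbb{E}_\theta \left\| A \sum_m a_m \frac{\partial x(\theta)}{\partial\theta_m} \right\|^2 + \mathbb{E}_\theta\left\{ \sum_{m,n} a_m a_n \frac{\partial \ln p(\theta)}{\partial\theta_m} \frac{\partial \ln p(\theta)}{\partial\theta_n} \right\}. \]

Since $A$ satisfies the tangent plane $\epsilon$-isometry property (4) for
the signal model $x(\theta)$, we have

\[ \left\| A \sum_m a_m \frac{\partial x(\theta)}{\partial\theta_m} \right\|^2 \le \frac{M}{N}\,(1 + \epsilon)^2 \left\| \sum_m a_m \frac{\partial x(\theta)}{\partial\theta_m} \right\|^2 \quad \forall \theta. \]

Multiplying both sides by $2/\sigma^2$ and taking expectation over
$\theta$, we see that the LHS is the first term in the expansion of
$a^T F(A) a$, while the RHS is the corresponding term in an
analogous expansion of $a^T F\left(\sqrt{M/N}\,(1 + \epsilon)\, I_N\right) a$. Since the
second terms in both of these quadratic forms are the same,
we get

\[ a^T F(A)\, a \le a^T F\left(\sqrt{M/N}\,(1 + \epsilon)\, I_N\right) a \quad \forall a. \]

This establishes the required bound on $F(A)$ in one direction.
The proof for the other direction is analogous. ■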

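Remark: In many problems, we are interested in a pointwise
CRB which sets limits on the estimation performance as a
function of the true parameter $\theta_{\text{true}}$. The pointwise CRB at
$\theta_{\text{true}}$ can be obtained from the pointwise FIM whose $(m, n)$th
element is given by

\[ F_{m,n}(B; \theta_{\text{true}}) = \frac{2}{\sigma^2}\, \Re\left\{ \left(B \frac{\partial x(\theta)}{\partial\theta_m}\right)^H \left(B \frac{\partial x(\theta)}{\partial\theta_n}\right) \right\} \Bigg|_{\theta = \theta_{\text{true}}}. \]

By analyzing the quadratic forms of the pointwise FIM as
before, we can see that when $A$ provides an isometry for the
tangent plane at $\theta_{\text{true}}$ alone, the pointwise CRB at $\theta_{\text{true}}$ is
preserved (up to SNR penalty).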

B. Ziv-Zakai Bound 

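We state the ZZB and relate it to the pairwise isometry
property. Since the ZZB is not as widely used as the CRB,
we provide a brief review in Appendix A.

ZZB for the estimation problem (5): Consider any estimator
$\hat{\theta}(y)$. The ZZB establishes a lower bound on the mean-squared
error in estimating $a^T\theta$, i.e. $\mathbb{E}|a^T(\hat{\theta}(y) - \theta)|^2$, which we
denote by $Z(B, a)$. The expression for $Z(B, a)$, which we
will state next, is complicated. However, all we need is the
fact that the measurement matrix appears in $Z(B, a)$ only
through pairwise SNRs $d_B^2(\theta_1, \theta_2)/\sigma^2$ where
$d_B(\theta_1, \theta_2) = \|Bx(\theta_1) - Bx(\theta_2)\|$.

The ZZB is given by

\[ Z(B, a) = \frac{1}{2} \int_0^\infty \mathcal{V}\left\{ \max_{\delta:\, a^T\delta = h} \int \left(p(\phi) + p(\phi + \delta)\right) f(B, \phi, \phi + \delta)\, d\phi \right\} h\, dh, \tag{8} \]

where $\mathcal{V}\{\cdot\}$ is the valley filling operation, defined as
$\mathcal{V}\{q(h)\} = \max_{r \ge 0} q(h + r)$, and $f(B, \theta_1, \theta_2)$ is the
probability of error for the optimal detection rule in the following
problem:

\[ H_1: y = Bx(\theta_1) + z, \quad \Pr(H_1) = \frac{p(\theta_1)}{p(\theta_1) + p(\theta_2)}, \]
\[ H_2: y = Bx(\theta_2) + z, \quad \Pr(H_2) = \frac{p(\theta_2)}{p(\theta_1) + p(\theta_2)}. \]

Since $z \sim \mathcal{CN}(0, \sigma^2 I)$, $f(B, \theta_1, \theta_2)$ is given by

\[ f(B, \theta_1, \theta_2) = \Pr(H_1)\, \Phi\left( -\frac{d_B(\theta_1, \theta_2)}{\sqrt{2}\,\sigma} - \frac{\sqrt{2}\,\sigma}{2\, d_B(\theta_1, \theta_2)} \ln\frac{p(\theta_1)}{p(\theta_2)} \right) + \Pr(H_2)\, \Phi\left( -\frac{d_B(\theta_1, \theta_2)}{\sqrt{2}\,\sigma} + \frac{\sqrt{2}\,\sigma}{2\, d_B(\theta_1, \theta_2)} \ln\frac{p(\theta_1)}{p(\theta_2)} \right), \]

where $\Phi(\cdot)$ is the CDF of the standard normal distribution.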
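
The two ingredients of the ZZB, the binary detection error probability and the valley-filling operation, can be sketched numerically as follows. The error probability is written here as the standard MAP error for two signals at distance $d$ in circularly symmetric complex Gaussian noise (noise variance $\sigma^2/2$ along the line joining the hypotheses); the function names are hypothetical and the valley filling is applied on a discrete grid.

```python
import math

def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def detection_error_prob(d, sigma, p1, p2):
    """MAP error probability for two hypotheses at distance d > 0 in
    CN(0, sigma^2 I) noise, with prior densities p1 and p2."""
    pr1 = p1 / (p1 + p2)
    pr2 = 1.0 - pr1
    snr_term = d / (math.sqrt(2.0) * sigma)
    bias = sigma / (math.sqrt(2.0) * d) * math.log(p1 / p2)
    return (pr1 * std_normal_cdf(-snr_term - bias)
            + pr2 * std_normal_cdf(-snr_term + bias))

def valley_fill(q):
    """Discrete V{q}(h) = max_{r >= 0} q(h + r): running max from the right."""
    out = list(q)
    for i in range(len(out) - 2, -1, -1):
        out[i] = max(out[i], out[i + 1])
    return out
```

Because the error probability decreases as the distance grows, any matrix that shrinks or stretches pairwise distances by a known factor shifts the bound in a predictable direction.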

To draw the connection with pairwise isometry, we note that
$Z(B, a)$ depends on the measurement matrix $B$ through the
probability of error $f(B, \theta_1, \theta_2)$, which in turn only depends
on pairwise distances $d_B(\theta_1, \theta_2)$. Thus, when the compressive
measurement matrix $A$ guarantees pairwise isometries,
$f(A, \theta_1, \theta_2) \approx f(\sqrt{M/N}\, I_N, \theta_1, \theta_2)$. Using this, we show
that the ZZB with compressive measurements is approximately equal to
the ZZB with all $N$ measurements except for an SNR penalty
of $M/N$.

Theorem 2. Let $A$ be an $M \times N$ measurement matrix
with elements chosen i.i.d. from Uniform$\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$.
Suppose that $A$ satisfies the pairwise $\epsilon$-isometry property (3)
for the signal manifold $x(\theta)$. Then, the ZZB $Z(A, a)$, with the
compressive measurements in (2), is related to the ZZB with
all $N$ measurements $Z(I_N, a)$ as

\[ Z\left(\sqrt{M/N}\,(1 + \epsilon)\, I_N, a\right) \le Z(A, a) \le Z\left(\sqrt{M/N}\,(1 - \epsilon)\, I_N, a\right). \]

Proof: The probability of detection error $f(A, \theta_1, \theta_2)$
is a strictly decreasing function of the distance between
the hypotheses $d_A(\theta_1, \theta_2)$. Since $A$ satisfies the pairwise
$\epsilon$-isometry property (3), we have

\[ d_A(\theta_1, \theta_2) \le \sqrt{M/N}\,(1 + \epsilon)\, \|x(\theta_1) - x(\theta_2)\|. \]

Combining these facts, we get
$f(A, \theta_1, \theta_2) \ge f(\sqrt{M/N}\,(1 + \epsilon)\, I_N, \theta_1, \theta_2)$, which is the
probability of detection error with
all $N$ measurements, but with an SNR reduced by a factor of
$(M/N)(1 + \epsilon)^2$. Substituting these pointwise bounds in (8), we
get $Z\left(\sqrt{M/N}\,(1 + \epsilon)\, I_N, a\right) \le Z(A, a)$. The other inequality
can be proved similarly. ■



C. Number of measurements needed 

The ZZB converges to the CRB when the SNR is suffi- 
ciently high and "large" errors are unlikely, which is exactly 
when we would declare estimation of a continuous-valued 
parameter to be successful. For compressive estimation, the 
SNR depends (linearly) on the number of measurements M. 
Thus, we can predict the number of measurements required for 
successful estimation based on the convergence of the ZZB to 
the CRB. 

First, we note that for successful estimation, the number of 
measurements M must certainly be large enough for the matrix 
A to satisfy the pairwise isometry property. When A satisfies 
the pairwise isometry property, the CRB and ZZB with $M$
measurements are well approximated by $F^{-1}(\sqrt{M/N}\, I_N)$ and
$Z(\sqrt{M/N}\, I_N, a)$ respectively. The number of measurements
$M$ at which $a^T F^{-1}(\sqrt{M/N}\, I_N)\, a \approx Z(\sqrt{M/N}\, I_N, a)$ $\forall a \in
\mathbb{R}^K$ is where compressive estimation is likely to be successful.

In the next section, we illustrate these ideas in the context 
of the fundamental problem of estimating the frequency of a 
single sinusoid. 

V. Designing compressive estimation strategies 

In this section, we illustrate, using the example of fre- 
quency and phase estimation for a single sinusoid, how to 
apply the preceding results to design compressive estimation 
strategies. We describe an algorithm which attains the CRB 
given "enough" compressive measurements, and show how to 
determine how many measurements are enough, based on the 
convergence of the ZZB to the CRB. We implicitly assume that 
we have enough measurements for the appropriate isometries 
to hold; detailed analytical characterization of the number of 
measurements required for this purpose is deferred to later 
sections. 

The measurements are given by

\[ y = e^{j\phi} B x(\omega) + z, \tag{9} \]

where $x(\omega) = [e^{-j\omega(N-1)/2} \;\; e^{-j\omega(N-3)/2} \;\; \cdots \;\; e^{j\omega(N-1)/2}]^T$
is an $N$-dimensional sinusoid with frequency $\omega$, $\phi$ is its
phase, $B$ is an $M \times N$ complex-valued measurement matrix
and $z \sim \mathcal{CN}(0, \sigma^2 I)$. The parameters to be estimated
$\phi$ and $\omega$ are both distributed uniformly over $[0, 2\pi]$. Note
that there is a slight change in notation from the previous
section. Earlier, we denoted the parameter to be estimated
by $\theta = [\omega \;\; \phi]$ and the signal manifold by
$x(\theta) = e^{j\phi} [e^{-j\omega(N-1)/2} \;\; e^{-j\omega(N-3)/2} \;\; \cdots \;\; e^{j\omega(N-1)/2}]^T$. We now
separate the contributions from the phase and frequency and
use $x(\omega)$ to denote a sinusoid with frequency $\omega$ and zero phase
($\phi = 0$).

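When we make all $N$ measurements (setting $B = I_N$ in
(9)), the CRB and the ZZB are well known. The FIM in
estimating $\theta = [\omega \;\; \phi]$ is shown in [14] to be

\[ F(I_N) = \frac{2}{\sigma^2} \begin{bmatrix} N(N^2 - 1)/12 & 0 \\ 0 & N \end{bmatrix}. \tag{10} \]

In particular, the CRB on the variance of the frequency
estimate (computed as $a^T F^{-1}(I_N)\, a$ with $a = [1 \;\; 0]$) is
$\mathrm{CRB}(I_N) = 6\sigma^2/(N(N^2 - 1))$. The ZZB on the variance
of the frequency estimate (choosing $a = e_1 = [1 \;\; 0]$ in (8)) is given
by

\[ Z(I_N, e_1) = \int \mathcal{V}\left\{ \max_{\phi' \in [0, 2\pi]} \Phi\left( -\frac{\|x(0) - e^{j\phi'} x(h)\|}{\sqrt{2}\,\sigma} \right) \right\} h\, dh
= \int \mathcal{V}\left\{ \Phi\left( -\frac{\sqrt{N}}{\sigma} \sqrt{1 - \left| \frac{\sin(Nh/2)}{N \sin(h/2)} \right|} \right) \right\} h\, dh. \]

Fig. 1. Lower bounds on estimation error as a function of the number of
compressive measurements $M$ at a per-sample SNR $1/\sigma^2$ of $-6$ dB for an
$N = 1024$ sinusoid. (Horizontal axis: number of compressive measurements.)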



Suppose now that we make $M$ compressive measurements, choosing $M$ large enough so that the measurement matrix $A$ satisfies the pairwise $\epsilon$-isometry property for the $\{e^{j\phi}x(\omega)\}$ signal model (the number of compressive measurements needed to establish pairwise isometries for this signal model is analytically characterized in Section VII). Then, from Section IV we know that the Fisher information with compressive measurements $F(A)$ is well-approximated by $F(\sqrt{M/N}\,I_N)$, the Fisher information with all $N$ measurements at an $M/N$ SNR penalty. Given that we know $F(I_N)$, computing $F(\sqrt{M/N}\,I_N)$ is easy: we simply replace $\sigma^2$ in (10) by $\sigma^2(N/M)$ (the observations in (9) with a measurement matrix $\sqrt{M/N}\,I_N$ at a noise level $\sigma^2$ are equivalent to observations with a measurement matrix $I_N$ at a noise level $\sigma^2(N/M)$). This argument holds true for the ZZB too. Thus, we get the CRB and the ZZB with compressive measurements to be



$$\mathrm{CRB}(A) \approx \mathrm{CRB}(\sqrt{M/N}\,I_N) = \frac{6\sigma^2}{M(N^2-1)},$$



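$$Z(A, a) \approx Z(\sqrt{M/N}\,I_N, a) = \int \mathcal{V}\left\{ Q\left( \frac{\sqrt{M}}{\sigma} \sqrt{1 - \left| \frac{\sin(Nh/2)}{N \sin(h/2)} \right|} \right) \right\} h\,dh.$$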



For this problem, we can show that the above expression for the CRB is actually exact, while the expression for the ZZB is a lower bound on the exact ZZB (i.e., $Z(\sqrt{M/N}\,I_N, a) \le Z(A, a)$), and is therefore also a lower bound on the mean square error (MSE).
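The SNR-penalty argument above can be checked numerically. The following is a minimal sketch, not the authors' code: `crb_frequency` evaluates the compressive CRB $6\sigma^2/(M(N^2-1))$, and `pairwise_error` evaluates the Q-function error probability between two hypothesized sinusoids whose frequencies differ by $h$, with the phase offset chosen to make them hardest to distinguish, at the effective noise level $\sigma^2(N/M)$. The function names and the sample values are illustrative assumptions.

```python
import math

def crb_frequency(N, M, sigma2):
    """CRB on the frequency-estimate variance with M compressive measurements."""
    return 6.0 * sigma2 / (M * (N ** 2 - 1))

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pairwise_error(h, N, M, sigma2):
    """Error probability between sinusoids at frequency offset h
    (worst-case phase offset), at effective noise level sigma2 * N / M."""
    s = math.sin(h / 2.0)
    # Normalized Dirichlet kernel |sin(Nh/2) / (N sin(h/2))|, equal to 1 at h = 0.
    kernel = 1.0 if abs(s) < 1e-12 else abs(math.sin(N * h / 2.0) / (N * s))
    return q_function(math.sqrt(M / sigma2) * math.sqrt(max(0.0, 1.0 - kernel)))

N = 1024
sigma2 = 10 ** (6 / 10)  # per-sample SNR 1/sigma^2 = -6 dB
for M in (130, 160, N):
    print(M, crb_frequency(N, M, sigma2), pairwise_error(0.01, N, M, sigma2))
```

Reducing $M$ inflates the CRB by exactly $N/M$ and raises the pairwise error probability, which is the sense in which compressive projection acts as a reduction in effective SNR.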

We now illustrate how to predict the number of measurements needed for successful compressive estimation based on the convergence of the ZZB to the CRB. Consider frequency estimation with $N = 1024$ samples with per-sample SNR $= 1/\sigma^2 = -6$ dB. In Figure 1, we plot the CRB and the ZZB as a function of the number of compressive measurements $M$. We notice a clear separation between the two bounds at $M = 130$, and that they converge at about $M = 160$. This suggests that we need more than 130 measurements, and that $M = 160$ measurements should suffice.

We now describe an algorithm whose performance closely 
follows these predictions: the algorithm achieves the CRB 
with 160 measurements but has large estimation errors with 
M = 130 measurements. This illustrates the efficiency of the 
algorithm, as well as the accuracy of our design guideline of 
"sufficient effective SNR." 

Algorithm: Suppose that for the purposes of algorithm design, we ignore the fact that the unknown phase rotation $e^{j\phi}$ has unit amplitude and estimate the complex gain $g$ and the frequency $\omega$ according to the model

$$y = gAx(\omega) + z.$$

The ML estimates of the gain and frequency $(\hat{g}, \hat{\omega})$ are obtained by maximizing the function

$$S(g, \omega) = \Re\{y^H gAx(\omega)\} - \tfrac{1}{2}\|gAx(\omega)\|^2$$

over $g \in \mathbb{C}$, $\omega \in [0, 2\pi]$, where $\Re\{a\}$ denotes the real part of the complex number $a$. Performing a direct optimization over $g$ and $\omega$ is difficult. Therefore, we resort to a two-stage procedure, consisting of a detection phase and a refinement phase, which we describe now.

(i) Detection phase: First, we notice that for any $\omega$, the optimizing $g$ is given by $(Ax(\omega))^H y/\|Ax(\omega)\|^2$. Substituting this in the cost function $S(g, \omega)$, we see that the ML estimate of the frequency $\hat{\omega}$ should maximize $G(\omega) = \max_{g \in \mathbb{C}} S(g, \omega) = 0.5\,|y^H Ax(\omega)|^2/\|Ax(\omega)\|^2$. We obtain a coarse frequency estimate by discretizing the frequencies uniformly into a set $F = \{0, 2\pi/(2N), \ldots, 2\pi(2N-1)/(2N)\}$ of size $2N$ and then choosing $q^* \in F$ that maximizes $G(q)$, $q \in F$. Since the frequency estimation error is substantial (on the order of $1/N$), we call this the detection phase. The gain estimate is given by $\hat{g} = (Ax(q^*))^H y/\|Ax(q^*)\|^2$.

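(ii) Refinement phase: In the second stage, we iteratively refine the gain and frequency estimates. Suppose that after the $n$th round of optimization, the gain and frequency estimates are given by $\hat{g}_n$ and $\hat{\omega}_n$ respectively (starting off with the estimates from the detection phase). In the $(n+1)$th round, we refine the frequency estimate by fixing the gain to $\hat{g}_n$ and locally optimizing $S(\hat{g}_n, \omega)$ around $\hat{\omega}_n$ using Newton's method:

$$\hat{\omega}_{n+1} = \hat{\omega}_n - \frac{\partial S(\hat{g}_n, \hat{\omega}_n)/\partial\omega}{\partial^2 S(\hat{g}_n, \hat{\omega}_n)/\partial\omega^2},$$

where

$$\frac{\partial S(g, \omega)}{\partial\omega} = \Re\left\{ (y - gAx(\omega))^H gA\,\frac{dx(\omega)}{d\omega} \right\},$$

$$\frac{\partial^2 S(g, \omega)}{\partial\omega^2} = \Re\left\{ (y - gAx(\omega))^H gA\,\frac{d^2x(\omega)}{d\omega^2} \right\} - \left\| gA\,\frac{dx(\omega)}{d\omega} \right\|^2.$$

Next, fixing the frequency estimate to $\hat{\omega}_{n+1}$, we get the updated gain after the $(n+1)$th round to be $\hat{g}_{n+1} = (Ax(\hat{\omega}_{n+1}))^H y/\|Ax(\hat{\omega}_{n+1})\|^2$. We typically perform ten such rounds of iterative optimization.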
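The two-stage procedure can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it assumes the cost $S(g,\omega) = \Re\{y^H gAx(\omega)\} - \tfrac{1}{2}\|gAx(\omega)\|^2$, an all-ones window, and hypothetical helper names (`sinusoid`, `estimate_frequency`). The demo at the bottom uses a noiseless signal with $A = I_N$ purely as a sanity check.

```python
import numpy as np

def sinusoid(omega, N):
    """Zero-phase sinusoid x(omega) and its first two derivatives in omega."""
    n = np.arange(N) - (N - 1) / 2.0
    x = np.exp(1j * omega * n)
    return x, 1j * n * x, -(n ** 2) * x

def estimate_frequency(y, A, n_refine=10):
    """Detection on a 2N-point grid followed by Newton refinement."""
    M, N = A.shape
    n = np.arange(N) - (N - 1) / 2.0
    # Detection: maximize G(q) = 0.5 |y^H A x(q)|^2 / ||A x(q)||^2 over the grid.
    grid = 2 * np.pi * np.arange(2 * N) / (2 * N)
    AX = A @ np.exp(1j * np.outer(n, grid))        # column k is A x(grid[k])
    G = 0.5 * np.abs(AX.conj().T @ y) ** 2 / np.sum(np.abs(AX) ** 2, axis=0)
    omega = grid[np.argmax(G)]
    Ax = A @ sinusoid(omega, N)[0]
    g = np.vdot(Ax, y) / np.vdot(Ax, Ax).real      # (Ax)^H y / ||Ax||^2
    # Refinement: Newton step in omega (gain fixed), then closed-form gain update.
    for _ in range(n_refine):
        x, dx, d2x = sinusoid(omega, N)
        r = y - g * (A @ x)
        dS = np.real(np.vdot(r, g * (A @ dx)))
        d2S = np.real(np.vdot(r, g * (A @ d2x))) - np.linalg.norm(g * (A @ dx)) ** 2
        omega = omega - dS / d2S
        Ax = A @ sinusoid(omega, N)[0]
        g = np.vdot(Ax, y) / np.vdot(Ax, Ax).real
    return g, omega % (2 * np.pi)

# Sanity check (hypothetical sizes): noiseless signal, all N samples (A = I_N).
N = 64
A = np.eye(N, dtype=complex)
y = np.exp(0.7j) * (A @ sinusoid(1.2345, N)[0])
g_hat, omega_hat = estimate_frequency(y, A)
print(abs(omega_hat - 1.2345), abs(g_hat - np.exp(0.7j)))
```

For compressive operation, `A` would instead be an $M \times N$ random matrix and `y` would include noise; the grid search costs one $M \times N$ by $N \times 2N$ matrix product, and each refinement round costs a few matrix-vector products.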

Results: We simulate the performance of the algorithm with $M = 130$ and $M = 160$ measurements across $2 \times 10^4$ trials at a per-sample SNR of $-6$ dB (for each $M$, we use the same measurement matrix $A$ for all the trials). We plot the complementary CDF (CCDF) of the squared frequency estimation error for $M = 130$ in Figure 2(a) and for $M = 160$ in Figure 2(b). We also plot the corresponding CRB and ZZB on the variance of the frequency estimate. With $M = 130$ measurements, we notice a performance floor due to large estimation errors, which disappears when we make $M = 160$ measurements. This is consistent with our earlier prediction based on the convergence of the ZZB to the CRB.

However, there are two possible explanations for the large estimation errors with $M = 130$ measurements: (a) the measurement matrix $A$ distorts the geometry of the estimation problem (by not satisfying the pairwise isometry property, in which case the approximations used to compute the ZZB are invalid) or (b) the number of measurements is sufficient to preserve the geometry, but the SNR is insufficient. We show now that it is actually the insufficient SNR which causes large errors. In order to do this, we run simulations where we make $M = 130$ measurements, with per-sample SNR boosted by a factor $160/130$ to $-5.1$ dB (ensuring that the overall SNR $M/\sigma^2$ is identical to a system with 160 measurements at SNR $= -6$ dB). The results are plotted in Figure 2(c) and we do not see any large errors. This illustrates that it was insufficient SNR, rather than distortion of the geometry, which resulted in large errors with $M = 130$ at a per-sample SNR of $-6$ dB. For further numerical evidence, we compute the ratio of the MSE to the CRB: when the per-sample SNR is $-5.1$ dB, we essentially attain the CRB, with MSE/CRB $= 0.3$ dB; however, when the per-sample SNR is $-6$ dB, MSE/CRB is a massive 38 dB.
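The boosted SNR quoted above follows from keeping the overall SNR $M/\sigma^2$ fixed. A small sketch of the arithmetic (the helper name `boosted_snr_db` is an illustrative assumption):

```python
import math

def boosted_snr_db(snr_db, m_ref, m):
    """Per-sample SNR (dB) at which m measurements give the same overall
    SNR M / sigma^2 as m_ref measurements at snr_db."""
    return snr_db + 10.0 * math.log10(m_ref / m)

print(round(boosted_snr_db(-6.0, 160, 130), 1))  # -5.1
```

The boost is $10\log_{10}(160/130) \approx 0.9$ dB, so 130 measurements at about $-5.1$ dB carry the same total signal energy relative to noise as 160 measurements at $-6$ dB.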

VI. Isometry Conditions for Frequency Estimation from Compressive Measurements

In the single sinusoid example in the previous section, we 
assume that there are enough measurements to guarantee the 
required isometries. In this section, we seek to analytically 
characterize the number of measurements required to pro- 
vide such guarantees. We show that, for a mixture of K 
sinusoids, the number of measurements required depends on 
the conditioning of appropriately defined matrices, which in 
turn depends on the separation between the frequencies in the 
mixture. We go back to a single sinusoid, for which we can 
prove stronger results, in the next section. 

Consider a signal manifold which is a sum of $K$ complex sinusoids $\sum_{k=1}^{K} g_k\,x(\omega_k)$, where $g_k \in \mathbb{C}$ are complex gains and

$$x(\omega) = [h_1 e^{-j\omega(N-1)/2}\; \cdots\; h_N e^{j\omega(N-1)/2}]^T \qquad (11)$$

is a windowed sinusoid, with the window weights given by $\{h_n\}$ (the sinusoid in the previous section is a special case, where the window used is all-ones). Without loss of generality, we assume that the window weights are normalized so that $\sum_n |h_n|^2 = 1$. To avoid trivialities, we assume that more than one entry among the $h_n$'s is non-zero.

Fig. 2. The CCDFs of squared estimation errors from $2 \times 10^4$ simulation runs for per-sample SNR $1/\sigma^2$ for an $N = 1024$ sinusoid from $M$ compressive measurements. (a) Scenario: $1/\sigma^2 = -6$ dB, $M = 130$; MSE gap to the CRB is 38 dB. (b) Scenario: $1/\sigma^2 = -6$ dB, $M = 160$; MSE gap to the CRB is 0.3 dB. (c) Scenario: $1/\sigma^2 = -5.1$ dB, $M = 130$; MSE gap to the CRB is 0.3 dB.

Suppose that we make $M$ compressive measurements of the form

$$y = \sum_{l=1}^{K} g_l\,Ax(\omega_l) + z,$$

and we wish to estimate the gains $\mathbf{g} = [g_1 \cdots g_K]^T$ and the frequencies $\boldsymbol{\omega} = [\omega_1 \cdots \omega_K]^T$.

Tangent plane isometry for a mixture of K sinusoids: Our first goal is to quantify the number of measurements needed to preserve the CRB (pointwise) at $\boldsymbol{\omega}$. We show that this is equivalent to guaranteeing $\epsilon$-isometry for a set of tangent planes as follows. For any specific value of the unknown parameters, namely the gain magnitudes $\{|g_l|\}$, phases $\{g_l/|g_l|\}$ and frequencies $\{\omega_l\}$ (we split the complex gain in this manner in order to restrict attention to real parameters), Theorem 1 guarantees that the CRB can be preserved by ensuring $\epsilon$-isometry for the plane tangent to the manifold at this set of parameters. Therefore, to preserve the CRB at $\boldsymbol{\omega}$, we need to guarantee $\epsilon$-isometry for tangent planes for all values that the gain magnitudes $\{|g_l|\}$ and the phases $\{g_l/|g_l|\}$ can take. We can show that the union of all such tangent planes is a subset of the span of the matrix $T(\boldsymbol{\omega})$, defined as

$$T(\boldsymbol{\omega}) = \left[ x(\omega_1)\; \cdots\; x(\omega_K)\;\; \tau\frac{dx(\omega_1)}{d\omega}\; \cdots\; \tau\frac{dx(\omega_K)}{d\omega} \right], \qquad (12)$$

where $\tau = 1/\|dx(\omega)/d\omega\|$ (note that $\tau$ does not depend on $\omega$). Therefore, if the compressive measurement matrix $A$ satisfies

$$\sqrt{\frac{M}{N}}\,(1 - \epsilon) \le \frac{\|AT(\boldsymbol{\omega})\mathbf{q}\|}{\|T(\boldsymbol{\omega})\mathbf{q}\|} \le \sqrt{\frac{M}{N}}\,(1 + \epsilon) \quad \forall\, \mathbf{q},$$

we can preserve the CRB (up to the SNR penalty) for a given set of frequencies $\boldsymbol{\omega}$. Furthermore, if the above relationship holds, we say that $A$ satisfies the tangent plane $\epsilon$-isometry property at $\boldsymbol{\omega}$.

Our first result is to show that the smallest singular value of the matrix $T(\boldsymbol{\omega})$, given by $\delta = \min_{\mathbf{q}} \|T(\boldsymbol{\omega})\mathbf{q}\|/\|\mathbf{q}\|$, compactly characterizes the number of measurements needed to preserve tangent plane $\epsilon$-isometry. Specifically, if $A_T(\delta)$ denotes the set of all frequencies $\boldsymbol{\omega}$ for which the smallest singular value of $T(\boldsymbol{\omega})$ is larger than $\delta$, then we can guarantee tangent plane $\epsilon$-isometry for this set of frequencies with $M = O(\epsilon^{-2} K \log(NK\epsilon^{-1}\delta^{-1}))$ measurements.

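Theorem 3. Let $A$ be an $M \times N$ measurement matrix whose entries are drawn i.i.d. from Uniform$\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$. Let $T(\boldsymbol{\omega})$ denote the tangent plane matrix (12) of sinusoids with frequencies $\boldsymbol{\omega} = (\omega_1 \ldots \omega_K) \in \mathbb{R}^K$. Let $A_T(\delta) = \{\boldsymbol{\omega} : \text{smallest singular value of } T(\boldsymbol{\omega}) > \delta\}$. Then, for any $\epsilon > 0$, we have

$$1 - \epsilon \le \sqrt{\frac{N}{M}}\,\frac{\|AT(\boldsymbol{\omega})\mathbf{q}\|}{\|T(\boldsymbol{\omega})\mathbf{q}\|} \le 1 + \epsilon \quad \forall\, \mathbf{q} \in \mathbb{C}^{2K},\ \boldsymbol{\omega} \in A_T(\delta)$$

with high probability when $M = O(\epsilon^{-2} K \log(NK\epsilon^{-1}\delta^{-1}))$.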

Remarks:

• When two among the $K$ frequencies, say $\omega_i$ and $\omega_j$, are close to one another, the columns $x(\omega_i)$ and $x(\omega_j)$ are approximately equal (as are the columns $dx(\omega_i)/d\omega$ and $dx(\omega_j)/d\omega$) and the matrix $T(\boldsymbol{\omega})$ is poorly conditioned. In such cases, the smallest singular value $\delta$ can be very small, so that a large number of measurements is required to guarantee tangent plane isometry and preserve the CRB. This intuitively pleasing result on the difficulty of estimating closely spaced frequencies applies even when we make a full set of $N$ measurements, and is in line with prior work. For example, [13] requires the sinusoids to be separated by at least four times the DFT spacing of $2\pi/N$ for total variation minimization to succeed with $N$ measurements.

• The singular values of $T(\boldsymbol{\omega})$ are well known to be the square roots of the eigenvalues of $T^H(\boldsymbol{\omega})T(\boldsymbol{\omega})$. By expanding out $T^H(\boldsymbol{\omega})T(\boldsymbol{\omega})$, we can show that each entry depends only on the set of frequency differences $\omega_i - \omega_j$, $1 \le i, j \le K$. Therefore, $\delta$ also depends only on the set of frequency differences. However, for practical estimation problems, we would like to characterize $\delta$ in more detail. For example, a typical question might be: given that the smallest frequency spacing between any two of the $K$ sinusoids is at least $\Delta\omega$, how many measurements do we need to preserve the CRB (up to the SNR penalty)? In order to answer this using Theorem 3, we need a lower bound on $\delta$ for any tangent plane matrix $T(\boldsymbol{\omega})$ when the pairwise frequency separation is at least $\Delta\omega$. We leave this as an open question for the general setting of $K$ sinusoids, but are able to provide a concrete answer for a single sinusoid in Section VII.
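The dependence of the smallest singular value on frequency spacing can be explored numerically. The following is a minimal sketch, assuming an all-ones window scaled to unit norm ($h_n = 1/\sqrt{N}$) and two sinusoids; the function names and the chosen spacings (in multiples of the DFT spacing $2\pi/N$) are illustrative.

```python
import numpy as np

def tangent_matrix(omegas, N):
    """T(omega) = [x(w_1) ... x(w_K), tau dx(w_1)/dw ... tau dx(w_K)/dw]
    for a unit-norm all-ones window."""
    n = np.arange(N) - (N - 1) / 2.0
    X = np.exp(1j * np.outer(n, omegas)) / np.sqrt(N)
    dX = 1j * n[:, None] * X
    tau = 1.0 / np.linalg.norm(dX[:, 0])   # identical for every frequency
    return np.hstack([X, tau * dX])

def smallest_singular_value(omegas, N):
    T = tangent_matrix(np.asarray(omegas, dtype=float), N)
    return np.linalg.svd(T, compute_uv=False)[-1]

N = 256
dft = 2 * np.pi / N
s4, s1, s025 = (smallest_singular_value([1.0, 1.0 + k * dft], N) for k in (4.0, 1.0, 0.25))
print(s4, s1, s025)
```

As the two frequencies approach each other, the columns of $T(\boldsymbol{\omega})$ become nearly linearly dependent and the printed values shrink, which is what drives up the measurement count in Theorem 3.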

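Pairwise isometry for a mixture of K sinusoids: Consider now the problem of quantifying the number of measurements needed to guarantee pairwise $\epsilon$-isometry for a mixture of $K$ sinusoids. We denote the matrix containing the sinusoids $[x(\omega_1)\; x(\omega_2)\; \ldots\; x(\omega_K)]$ by $X(\boldsymbol{\omega})$. From the definition of pairwise isometry in Section III, compressive measurements must preserve the ML cost structure, thereby implying that

$$1 - \epsilon \le \sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g} - AX(\boldsymbol{\omega}')\mathbf{g}'\|}{\|X(\boldsymbol{\omega})\mathbf{g} - X(\boldsymbol{\omega}')\mathbf{g}'\|} \le 1 + \epsilon$$

for pairs of $(\mathbf{g}, \boldsymbol{\omega})$ and $(\mathbf{g}', \boldsymbol{\omega}')$ of interest. We are typically interested in all values of the gains $\mathbf{g}, \mathbf{g}'$ but may restrict the set of frequencies $\boldsymbol{\omega}$ and $\boldsymbol{\omega}'$ to each come from a set $\Theta$ (for example, the set of $K$ frequencies that are separated pairwise by at least $\Delta\omega$).

To simplify the problem, we only consider $\boldsymbol{\omega}$ and $\boldsymbol{\omega}'$ that are "well-separated" (we comment on why this helps later). For example, we may restrict $\boldsymbol{\omega}'$ to $\Theta'(\boldsymbol{\omega}) = \Theta \setminus B(\boldsymbol{\omega}, \mu)$, where $B(\boldsymbol{\omega}, \mu)$ is a small ball of frequencies around $\boldsymbol{\omega}$. (A possible definition for the ball $B(\boldsymbol{\omega}, \mu)$ can be $B(\boldsymbol{\omega}, \mu) = \{\boldsymbol{\omega}' : \min_{1 \le i, j \le K} |\omega'_i - \omega_j| < \mu\}$.) Suppose that we make enough measurements to guarantee pairwise $\epsilon$-isometry for all $\boldsymbol{\omega} \in \Theta$ and $\boldsymbol{\omega}' \in \Theta'(\boldsymbol{\omega})$, no matter what value $\boldsymbol{\omega}$ takes. This implies that for any set of frequencies $\boldsymbol{\omega} \in \Theta$, we have preserved the cost-structure of the estimation problem at hypothesis frequencies $\boldsymbol{\omega}'$ that are "far-away" ($\boldsymbol{\omega}'$ outside $B(\boldsymbol{\omega}, \mu)$). Roughly, a good estimation algorithm should not incur frequency errors larger than $\mu$.

We introduce some notation that will simplify the following discussion. Let $\tilde{\boldsymbol{\omega}} = [\boldsymbol{\omega}^T\; \boldsymbol{\omega}'^T]^T$ and $\tilde{\mathbf{g}} = [\mathbf{g}^T\; -\mathbf{g}'^T]^T$ denote vectors of length $2K$ concatenating the frequencies and gains. Also let $\tilde{X}(\tilde{\boldsymbol{\omega}}) = [X(\boldsymbol{\omega})\; X(\boldsymbol{\omega}')]$ denote the $N \times 2K$ matrix containing all the sinusoids. Note that $\tilde{\mathbf{g}}$ can take any value in $\mathbb{C}^{2K}$ but $\tilde{\boldsymbol{\omega}}$ has a special structure: its first $K$ entries $\boldsymbol{\omega}$ must belong to $\Theta$ and its last $K$ entries come from a set $\Theta'(\boldsymbol{\omega})$ that depends on the first $K$ values. As shorthand, we say that $\tilde{\boldsymbol{\omega}} \in \tilde{\Theta} = \{[\boldsymbol{\omega}^T\; \boldsymbol{\omega}'^T]^T : \boldsymbol{\omega} \in \Theta, \boldsymbol{\omega}' \in \Theta'(\boldsymbol{\omega})\}$. With this notation, the above pairwise isometry condition for a mixture of $K$ sinusoids can be written (more formally) as

$$1 - \epsilon \le \sqrt{\frac{N}{M}}\,\frac{\|A\tilde{X}(\tilde{\boldsymbol{\omega}})\tilde{\mathbf{g}}\|}{\|\tilde{X}(\tilde{\boldsymbol{\omega}})\tilde{\mathbf{g}}\|} \le 1 + \epsilon \quad \forall\, \tilde{\mathbf{g}} \in \mathbb{C}^{2K} \qquad (13)$$

for a particular $\tilde{\boldsymbol{\omega}} \in \tilde{\Theta}$. If the matrix $A$ satisfies this relationship, we say that $A$ guarantees $\epsilon$-isometry (just isometry, not pairwise) at $\tilde{\boldsymbol{\omega}}$ ($2K$ sinusoids).

Our goal is to quantify the number of measurements necessary for (13) to hold for all $\tilde{\boldsymbol{\omega}} \in \tilde{\Theta}$. While solving this problem in its entirety is difficult, we can break it down into two subproblems, the first of which we tackle. We explain the solution to this subproblem and then comment on the other. In analogy to the previous section, let $A_p(\delta)$ denote the set of all frequencies $\tilde{\boldsymbol{\omega}}$ (chosen from anywhere in $\mathbb{R}^{2K}$, not just $\tilde{\Theta}$) such that the smallest singular value of $\tilde{X}(\tilde{\boldsymbol{\omega}})$ is at least as large as $\delta$. Suppose that we want $A$ to guarantee $\epsilon$-isometry for all $\tilde{\boldsymbol{\omega}} \in A_p(\delta)$ (same relationship as (13) except that the set from which $\tilde{\boldsymbol{\omega}}$ is chosen has changed). We can show that $M = O(\epsilon^{-2}(2K)\log(N(2K)\epsilon^{-1}\delta^{-1}))$ measurements suffice to provide such a guarantee with high probability.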

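Theorem 4. Suppose that $A$ is an $M \times N$ measurement matrix whose entries are drawn i.i.d. from Uniform$\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$. Let $X(\boldsymbol{\omega}) = [x(\omega_1)\; x(\omega_2)\; \ldots\; x(\omega_L)]$ denote an $N \times L$ matrix of sinusoids with $\boldsymbol{\omega} = (\omega_1 \ldots \omega_L) \in \mathbb{R}^L$. Let $A_p(\delta) = \{\boldsymbol{\omega} : \text{smallest singular value of } X(\boldsymbol{\omega}) \text{ is greater than or equal to } \delta\}$. For any $\epsilon > 0$ and $\delta > 0$, we have

$$1 - \epsilon \le \sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 + \epsilon \quad \forall\, \mathbf{g} \in \mathbb{C}^L,\ \boldsymbol{\omega} \in A_p(\delta), \qquad (14)$$

with high probability when $M = O(\epsilon^{-2} L \log(NL\epsilon^{-1}\delta^{-1}))$.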

Remarks:

• Returning to the problem posed in (13), suppose that the smallest singular value of $\tilde{X}(\tilde{\boldsymbol{\omega}})$, further minimized over all values of $\tilde{\boldsymbol{\omega}} \in \tilde{\Theta}$, is $\sigma_{\min} > 0$. Then, $\tilde{\Theta}$ is contained in $A_p(\sigma_{\min})$ and using Theorem 4, $M = O(\epsilon^{-2}(2K)\log(N(2K)\epsilon^{-1}\sigma_{\min}^{-1}))$ measurements suffice to guarantee the required $\epsilon$-isometry. We leave the question of quantifying $\sigma_{\min}$ for a given choice of $\Theta$ (say the set of $K$ frequencies that are separated pairwise by at least $\Delta\omega$) as an open problem, but explicitly compute it for the simpler setting of a single sinusoid in the next section.

• The previous remark also explains why we choose to restrict $\boldsymbol{\omega}'$ to $\Theta'(\boldsymbol{\omega}) = \Theta \setminus B(\boldsymbol{\omega}, \mu)$. The singular value of $\tilde{X}(\tilde{\boldsymbol{\omega}})$ when $\boldsymbol{\omega}, \boldsymbol{\omega}' \in \Theta$ can be made arbitrarily small by allowing $\boldsymbol{\omega}' \to \boldsymbol{\omega}$. Thus, in this case, we cannot directly use Theorem 4 to quantify the number of measurements required. However, this does not necessarily mean that an isometry cannot be provided for closely spaced sinusoids. Indeed, we show in the next section that for $K = 1$, it is possible to provide an isometry no matter how close $\omega$ and $\omega'$ get.
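The isometry in Theorem 4 can be probed empirically. The sketch below draws a measurement matrix with i.i.d. entries uniform on $\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$ and evaluates $\sqrt{N/M}\,\|AX(\boldsymbol{\omega})\mathbf{g}\|/\|X(\boldsymbol{\omega})\mathbf{g}\|$ at randomly drawn frequencies and gains. It is only a spot check under assumed sizes ($M = 256$, $N = 1024$, $K = 3$): random sampling does not search for the worst case that the theorem covers, and the function names are illustrative.

```python
import numpy as np

def random_measurement_matrix(M, N, rng):
    """M x N matrix with i.i.d. entries uniform on {+-1/sqrt(N), +-j/sqrt(N)}."""
    return rng.choice(np.array([1, -1, 1j, -1j]), size=(M, N)) / np.sqrt(N)

def isometry_ratios(A, K, trials, rng):
    """sqrt(N/M) * ||A X(w) g|| / ||X(w) g|| for random frequencies and gains."""
    M, N = A.shape
    n = np.arange(N) - (N - 1) / 2.0
    ratios = np.empty(trials)
    for t in range(trials):
        omegas = rng.uniform(0, 2 * np.pi, size=K)
        g = rng.standard_normal(K) + 1j * rng.standard_normal(K)
        v = np.exp(1j * np.outer(n, omegas)) @ g
        ratios[t] = np.sqrt(N / M) * np.linalg.norm(A @ v) / np.linalg.norm(v)
    return ratios

rng = np.random.default_rng(0)
A = random_measurement_matrix(256, 1024, rng)
r = isometry_ratios(A, K=3, trials=200, rng=rng)
print(r.min(), r.max())
```

The printed extremes indicate how far the sampled ratios stray from 1; increasing `M` tightens them, in line with the $\epsilon^{-2}$ dependence of the measurement count.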

A. Proof of Theorems 3 and 4

We give a proof of Theorem 4 along the lines of the proof in [6], where the authors extend the JL lemma (which gives the number of compressive measurements needed to preserve the geometry of a discrete point cloud) to a manifold by sampling the manifold and exploiting its continuity. Details of the proof can be found in Appendix B. A similar proof can be given for Theorem 3, which we briefly sketch in Appendix C.

VII. Pairwise Isometry for Frequency Estimation of a Single Sinusoid

In the previous section, we quantify the number of measurements needed to give pairwise isometries for a mixture of $K$ sinusoids in two distinct regimes: when the frequencies $(\boldsymbol{\omega}, \boldsymbol{\omega}')$ are "far apart" and in the limit of $\boldsymbol{\omega}' \to \boldsymbol{\omega}$ (tangent plane isometries). We now consider a single sinusoid ($K = 1$) and provide pairwise isometries for all frequency pairs. In order to do this, we consider two regimes of frequency pairs $(\omega_1, \omega_2)$: closely spaced and well-separated. For the set of well-separated frequencies, say $\{(\omega_1, \omega_2) : |\omega_1 - \omega_2| > \mu\}$, we obtain a bound on the smallest singular value of $X(\boldsymbol{\omega}) = [x(\omega_1)\; x(\omega_2)]$ and use it in Theorem 4 to immediately infer the number of measurements needed to guarantee pairwise $\epsilon$-isometry for sinusoids from this set. The challenge then is in providing a similar result for sinusoids whose frequencies are separated by less than $\mu$. We solve this problem in two stages: first, we use Theorem 3 to infer the number of measurements needed to guarantee tangent plane $\epsilon$-isometries for all frequencies (loosely, pairwise isometries for $\omega_1 \to \omega_2$). We then use the continuity of the sinusoidal manifold to extend these tangent plane $\epsilon$-isometries to a pairwise $2\epsilon$-isometry for closely-spaced frequencies $\{(\omega_1, \omega_2) : |\omega_1 - \omega_2| \le \mu\}$.

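Theorem 5. Suppose that $A$ is an $M \times N$ measurement matrix whose entries are drawn i.i.d. from Uniform$\{\pm 1/\sqrt{N}, \pm j/\sqrt{N}\}$. Let $x(\omega)$ denote a sinusoid (11) of frequency $\omega$. Let $H(\omega) = \sum_{n=1}^{N} |h_n|^2 e^{j\omega(n - (N+1)/2)}$, where $h_n$ are the window weights of the sinusoid. For any $\epsilon > 0$,

$$1 - \epsilon \le \sqrt{\frac{N}{M}}\,\frac{\|g_1 Ax(\omega_1) - g_2 Ax(\omega_2)\|}{\|g_1 x(\omega_1) - g_2 x(\omega_2)\|} \le 1 + \epsilon \quad \forall\, g_1, g_2, \omega_1, \omega_2$$

with high probability when $M = O(\epsilon^{-2}\log(N\epsilon^{-1}(1 - \tau\chi)^{-1}\zeta^{-1}\alpha^{-1}))$, where $\tau = 1/\|dx(\omega)/d\omega\|$, $\chi = |dH(0)/d\omega|$, $\alpha = 1/(N\tau)$ and $\zeta = -\frac{N^{-2}}{2}\left.\frac{d^2|H(\omega)|^2}{d\omega^2}\right|_{\omega=0}$.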

We present the results for closely spaced frequencies first, and then move to the well-separated setting.

Tangent plane isometry: For a single sinusoid, the tangent plane matrix at $\omega$ is given by $T(\omega) = [x(\omega)\;\; \tau\,dx(\omega)/d\omega]$, where $\tau = 1/\|dx(\omega)/d\omega\|$. The smallest singular value of $T(\omega)$, denoted by $\sigma_{\mathrm{tangent}}$, satisfies

$$\sigma_{\mathrm{tangent}}^2 = 1 - \tau\left|\left\langle x(\omega), \frac{dx(\omega)}{d\omega} \right\rangle\right| = 1 - \tau\left|\sum_{n=1}^{N} |h_n|^2\, j\left(n - \frac{N+1}{2}\right)\right|,$$

where the second equality is obtained from the definition of the sinusoid (11) by noting that the $n$th entry of $x(\omega)$ is $h_n e^{j\omega(n - (N+1)/2)}$. From the definition of $H(\omega)$, we see that

$$\sigma_{\mathrm{tangent}}^2 = 1 - \tau\left|\frac{dH(0)}{d\omega}\right|,$$

and therefore $\sigma_{\mathrm{tangent}} = \sqrt{1 - \tau\chi}$, where $\chi = |dH(0)/d\omega|$. By Jensen's inequality, we see that $\chi^2 < 1/\tau^2$ when the weight sequence $\{h_n\}$ has more than one nonzero tap. Thus, $\tau\chi < 1$ and therefore $\sigma_{\mathrm{tangent}}$ is strictly positive. Setting $\delta = \sqrt{1 - \tau\chi}$ in Theorem 3, we can provide tangent plane $\epsilon$-isometries for a single sinusoid with $M = O(\epsilon^{-2}\log(N\epsilon^{-1}(1 - \tau\chi)^{-1}))$ measurements.
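The closed form $\sigma_{\mathrm{tangent}} = \sqrt{1 - \tau\chi}$ is easy to confirm numerically. The sketch below uses a randomly drawn positive window, normalized so that $\sum_n |h_n|^2 = 1$; the window, the test frequency and the helper name `tangent_sigma` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
h = rng.uniform(0.1, 1.0, size=N)
h /= np.linalg.norm(h)                     # normalize: sum |h_n|^2 = 1
c = np.arange(1, N + 1) - (N + 1) / 2.0    # n - (N+1)/2 for n = 1..N

def tangent_sigma(omega):
    """Smallest singular value of T(omega) = [x, tau dx/domega], and tau."""
    x = h * np.exp(1j * omega * c)
    dx = 1j * c * x
    tau = 1.0 / np.linalg.norm(dx)
    T = np.column_stack([x, tau * dx])
    return np.linalg.svd(T, compute_uv=False)[-1], tau

sigma, tau = tangent_sigma(0.7)
chi = abs(np.sum(h ** 2 * 1j * c))         # |dH(0)/domega|
print(sigma, np.sqrt(1 - tau * chi))
```

The two printed numbers agree to numerical precision, and the result does not depend on the test frequency because $T^H(\omega)T(\omega)$ is independent of $\omega$.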

Extending tangent plane isometry to pairwise isometry for frequencies separated by at most $1/N^{1.5}$: We now extend $\epsilon$-isometry of the tangent planes to a pairwise $2\epsilon$-isometry for any two frequencies $\omega_1, \omega_2$ whose separation $\Delta = \omega_2 - \omega_1$ is "small" (we quantify how small later) by exploiting continuity. Let $q = (\omega_1 + \omega_2)/2$ be the average of the two frequencies. For small values of $|\Delta|$, a first-order Taylor series expansion for $x(\omega_1)$ and $x(\omega_2)$ around $x(q)$ will have small errors. Such an expansion gives us

$$x(\omega_1) = x(q) - (\Delta/2)\,\frac{dx(q)}{d\omega} + e_1,$$

$$x(\omega_2) = x(q) + (\Delta/2)\,\frac{dx(q)}{d\omega} + e_2,$$

where $e_1, e_2$ are the approximation errors. Consider a linear combination $X(\boldsymbol{\omega})\mathbf{g}$, where $X(\boldsymbol{\omega}) = [x(\omega_1)\; x(\omega_2)]$ and $\mathbf{g} = [g_1\; g_2]^T$. This can be written as

$$X(\boldsymbol{\omega})\mathbf{g} = v + e,$$

where $e = g_1 e_1 + g_2 e_2$ and

$$v = (g_1 + g_2)\,x(q) + (\Delta/2)(g_2 - g_1)\,\frac{dx(q)}{d\omega}$$

lies in the span of $T(q) = [x(q)\;\; \tau\,(dx(q)/d\omega)]$, the tangent plane at $\omega = q$.
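The split of $X(\boldsymbol{\omega})\mathbf{g}$ into a tangent-plane component $v$ and a remainder $e$ can be checked numerically. This sketch assumes a unit-norm all-ones window and compares $\|e\|$ against the bound $N^2\Delta^2/(4\sqrt{2})$ for a unit-norm gain vector; the helper names and the chosen frequencies are illustrative.

```python
import numpy as np

def sinusoid(omega, N):
    """Unit-norm all-ones-window sinusoid and its derivative in omega."""
    n = np.arange(N) - (N - 1) / 2.0
    x = np.exp(1j * omega * n) / np.sqrt(N)
    return x, 1j * n * x

def split_v_e(w1, w2, g, N):
    """Return (v, e) with g1 x(w1) + g2 x(w2) = v + e, v in the tangent plane at q."""
    q, delta = 0.5 * (w1 + w2), w2 - w1
    xq, dxq = sinusoid(q, N)
    v = (g[0] + g[1]) * xq + 0.5 * delta * (g[1] - g[0]) * dxq
    Xg = g[0] * sinusoid(w1, N)[0] + g[1] * sinusoid(w2, N)[0]
    return v, Xg - v

N = 128
delta = 0.5 / N ** 1.5
g = np.array([0.6, -0.8j])                  # ||g|| = 1
v, e = split_v_e(1.0, 1.0 + delta, g, N)
bound = N ** 2 * delta ** 2 / (4 * np.sqrt(2))
print(np.linalg.norm(e), bound)
```

The remainder is second order in $\Delta$, so at separations on the order of $1/N^{1.5}$ it is small compared with $\|v\|$, which is what allows the tangent plane isometry to carry over.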

Since $A$ guarantees $\epsilon$-isometries for tangent planes at all frequencies, for any vector $T(q)\mathbf{h}$ in the tangent plane at $q$, the quantity $\|AT(q)\mathbf{h}\|$ is bounded within $(1 \pm \epsilon)\sqrt{M/N}\,\|T(q)\mathbf{h}\|$. Expanding out $\|AX(\boldsymbol{\omega})\mathbf{g}\|/\|X(\boldsymbol{\omega})\mathbf{g}\|$ in terms of $v$ and $e$ and applying the tangent plane isometry condition to $\|Av\|/\|v\|$, we can show that

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm \left( \epsilon + \frac{5\sqrt{N}\,\|e\|}{\|v\|} \right),$$

where $x \le y \pm z$ denotes $y - z \le x \le y + z$. Next, we get bounds on $\|e\|$ and $\|v\|$ as follows. First, we use the mean value theorem to show that the error is bounded as $\|e\| \le N^2\Delta^2/(4\sqrt{2})$. Next, since $v$ lies in the span of $T(q)$, we can use the bound on the minimum singular value of $T(q)$ to get $\|v\| \ge \sqrt{1 - \tau\chi}\,|\Delta|/(\sqrt{2}\,\tau)$. The details are given in Appendix D. Substituting these bounds in the above equation, we obtain

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm \left( \epsilon + \frac{5\tau|\Delta|N^{2.5}}{4\sqrt{1 - \tau\chi}} \right).$$

We note that $\tau = 1/\|dx(\omega)/d\omega\|$ scales as $1/N$. Therefore, defining a scale-invariant constant $\alpha = 1/(N\tau)$, we see that, as long as the frequency separation $|\Delta| \le (4\alpha\epsilon\sqrt{1 - \tau\chi}/5)/N^{1.5}$, we can get a $2\epsilon$ isometry

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm 2\epsilon.$$

Thus, if $A$ provides an $\epsilon/2$ tangent plane isometry for all frequencies (which can be achieved with $M = O(\epsilon^{-2}\log(N\epsilon^{-1}(1 - \tau\chi)^{-1}))$ measurements), we can extend it to a pairwise $\epsilon$-isometry for the set of frequencies $\omega_1, \omega_2$ whose separation $|\omega_1 - \omega_2| \le (4\alpha(\epsilon/2)\sqrt{1 - \tau\chi}/5)/N^{1.5}$.
Pairwise isometry for frequencies separated by more than $1/N^{1.5}$: We now use Theorem 4 to quantify the number of measurements necessary to guarantee pairwise $\epsilon$-isometry for two frequencies that are separated by more than $C_s/N^{1.5}$, where $C_s = (4\alpha(\epsilon/2)\sqrt{1 - \tau\chi}/5)$.

First, we obtain a bound on the smallest singular value of $X(\omega_1, \omega_2) = [x(\omega_1)\; x(\omega_2)]$. Denoting the smallest singular value by $\sigma_{\mathrm{signal}}$, we can show that it satisfies

$$\sigma_{\mathrm{signal}}^2 = 1 - |\langle x(\omega_1), x(\omega_2) \rangle|.$$

Furthermore, we can show that $|\langle x(\omega_1), x(\omega_2) \rangle| = |H(\omega_1 - \omega_2)|$, where $H(\omega) = \sum_{n=1}^{N} |h_n|^2 e^{j\omega(n - (N+1)/2)}$. Thus, we have $\sigma_{\mathrm{signal}}^2 = 1 - |H(\omega_1 - \omega_2)|$.
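The identity $\sigma_{\mathrm{signal}}^2 = 1 - |H(\omega_1 - \omega_2)|$ can be confirmed with a direct SVD. This is a minimal sketch with an arbitrary normalized window; the window, the two frequencies and the helper names `x` and `H` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64
h = rng.uniform(0.1, 1.0, size=N)
h /= np.linalg.norm(h)                     # sum |h_n|^2 = 1
c = np.arange(1, N + 1) - (N + 1) / 2.0    # n - (N+1)/2 for n = 1..N

def x(omega):
    """Windowed sinusoid of frequency omega."""
    return h * np.exp(1j * omega * c)

def H(omega):
    """H(omega) = sum_n |h_n|^2 exp(j omega (n - (N+1)/2))."""
    return np.sum(h ** 2 * np.exp(1j * omega * c))

w1, w2 = 0.9, 0.9 + 3.0 / N
sigma = np.linalg.svd(np.column_stack([x(w1), x(w2)]), compute_uv=False)[-1]
print(sigma ** 2, 1 - abs(H(w1 - w2)))
```

Because the Gram matrix of two unit-norm columns has eigenvalues $1 \pm |\langle x(\omega_1), x(\omega_2)\rangle|$, the smallest singular value depends only on the frequency difference through $|H|$.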

Suppose now that $|\omega_1 - \omega_2| > C_s/N^{1.5}$. For large values of $N$, the smallest singular value of $X(\omega_1, \omega_2)$ is bounded as

$$\sigma_{\mathrm{signal}}^2 \ge \frac{0.5\,\zeta C_s^2}{N}, \quad \text{where} \quad \zeta = -\frac{N^{-2}}{2}\left.\frac{d^2|H(\omega)|^2}{d\omega^2}\right|_{\omega=0}.$$

The details are given in Appendix E.

We now apply Theorem 4 with $\delta = \sqrt{0.5\,\zeta C_s^2/N}$. The set of all frequencies $|\omega_1 - \omega_2| > C_s/N^{1.5}$ is contained in $A_p(\sqrt{0.5\,\zeta C_s^2/N})$ and thus, we can guarantee pairwise $\epsilon$-isometry for this set with $M = O(\epsilon^{-2}\log(N\epsilon^{-1}(1 - \tau\chi)^{-1}\zeta^{-1}\alpha^{-1}))$ measurements.

Combining the isometries in the regimes $|\omega_1 - \omega_2| \le C_s/N^{1.5}$ and $|\omega_1 - \omega_2| > C_s/N^{1.5}$ completes the proof of Theorem 5.



VIII. Conclusions and Future Work 

For parameter estimation in AWGN, we have identified 
isometry conditions under which the only effect of making 
compressive measurements is an SNR penalty equal to the 
dimensionality reduction factor. We prove this by establish- 
ing a connection between the isometry conditions and the 
CRB/ZZB. For a mixture of K sinusoids of length N, we 
show that $O(K \log(NK\delta^{-1}))$ measurements suffice to provide
such isometries, where $\delta$ is the smallest singular value of
appropriate matrices (stronger results are obtained for K = 1). 
Based on the connection between the ZZB and CRB, we also 
observe that, in order to avoid large estimation errors, the com- 
pressive measurements must not only preserve the geometry, 
but the SNR after the dimension reduction penalty must also 
be above a threshold. We illustrate this by showing that, for 
frequency estimation for a single sinusoid, the convergence of 
the ZZB to the CRB can be used to tightly predict the number 
of measurements needed to avoid error floors. 

We leave open the issue of establishing the relationship be- 
tween the smallest singular value S and the minimum spacing 
between sinusoids, and whether the stronger isometry results 
established for a single sinusoid can be extended to $K > 1$.
Another interesting topic for future work is the development 
of an analytical understanding of multi-dimensional sinusoid 
estimation, motivated by practical applications such as large 
2D arrays for mm-wave communication [11] and imaging. 
Finally, investigation of compressive parameter estimation in 
non-Gaussian settings is an interesting problem with few 
known results. 

Appendix A 
Ziv-Zakai Bound Review 

Consider the problem of estimating a parameter $\theta$ from measurements

$$y = Bx(\theta) + z, \quad \theta \in \Theta, \quad z \sim \mathcal{CN}(0, \sigma^2 I).$$

For an estimator $\hat{\theta}(y)$, let $e = \hat{\theta}(y) - \theta$ denote the estimation error. The ZZB lower bounds the error $E|a^T e|^2$ for any $a \in \mathbb{R}^K$ by relating it to the probabilities of error in a sequence of detection problems. We begin by describing one of the detection problems.

Consider a simplified version of the preceding model, in which the parameter $\theta$ takes only two values $\phi$ and $\phi + \delta$, occurring with probabilities $p(\phi)/(p(\phi) + p(\phi + \delta))$ and $p(\phi + \delta)/(p(\phi) + p(\phi + \delta))$, respectively. There are two possible ways to estimate $\theta$:

• Optimal detection-theoretic approach: Compute the Bayesian posterior probabilities $p(\phi|y)$ and $p(\phi + \delta|y)$. Choose $\phi$ if $p(\phi|y) > p(\phi + \delta|y)$ and $\phi + \delta$ otherwise. Denote the probability of error with this approach by $f(B, \phi, \phi + \delta)$.

• Heuristic approach using the estimate $\hat{\theta}(y)$: Form the estimate $\hat{\theta}(y)$; this could take any value in $\Theta$, and is not restricted to $\{\phi, \phi + \delta\}$. Classify based on the following rule: if $a^T\hat{\theta}(y) \le a^T\phi + (h/2)$, where $h = a^T\delta$, choose $\phi$ to have occurred; else, choose $\phi + \delta$. Denote the probability of error with this scheme by $P_{\mathrm{sub}}(B, \phi, \phi + \delta)$.

Since the Bayesian detection rule is optimal, we have $f(B, \phi, \phi + \delta) \le P_{\mathrm{sub}}(B, \phi, \phi + \delta)$.

In order to use this observation to bound $E|a^T e|^2$, we begin with the identity

$$E|a^T e|^2 = \frac{1}{2}\int_0^\infty \Pr\left(|a^T e| \ge h/2\right) h\,dh, \qquad (15)$$

and relate $\Pr(|a^T e| \ge h/2)$ to the probability of error with the heuristic rule $P_{\mathrm{sub}}(B, \phi, \phi + \delta)$ as follows:

$$\Pr\left(|a^T e| \ge h/2\right) = \int \left(p(\phi) + p(\phi + \delta)\right) P_{\mathrm{sub}}(B, \phi, \phi + \delta)\,d\phi, \qquad (16)$$

where $\delta$ is any vector satisfying $a^T\delta = h$. We now use the lower bound $P_{\mathrm{sub}}(B, \phi, \phi + \delta) \ge f(B, \phi, \phi + \delta)$ in (16) and, substituting back in (15), we get the basic version of the ZZB. We can further tighten the bound in two ways: (a) by choosing $\delta$ appropriately and (b) by exploiting the fact that $\Pr(|a^T e| \ge h/2)$ is non-increasing using the valley filling operation $\mathcal{V}\{\cdot\}$, defined as $\mathcal{V}\{q(h)\} = \max_{r \ge 0} q(h + r)$ (refer to [16] for details). This gives us the ZZB used in the main text.
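The identity relating the mean squared error to the tail probability of $|a^T e|$ follows from the standard tail-integral formula for a non-negative random variable. A short derivation, written here as a worked step and assuming only that $E|a^T e|^2$ is finite:

```latex
\begin{align*}
E|a^T e|^2
  &= \int_0^\infty \Pr\left(|a^T e|^2 \ge t\right) dt
     && \text{(tail-integral formula for a non-negative variable)} \\
  &= \int_0^\infty \Pr\left(|a^T e| \ge h/2\right) \frac{h}{2}\,dh
     && \text{(substituting } t = h^2/4,\ dt = (h/2)\,dh\text{)} \\
  &= \frac{1}{2}\int_0^\infty \Pr\left(|a^T e| \ge h/2\right) h\,dh .
\end{align*}
```

The change of variable $t = h^2/4$ is what introduces the threshold $h/2$ and the weighting $h\,dh$ that appear throughout the ZZB expressions.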

Appendix B
Proof of Theorem 4

Let $\boldsymbol{\omega} = [\omega_1 \cdots \omega_K]^T$, $\mathbf{g} = [g_1 \cdots g_K]^T$ and $X(\boldsymbol{\omega}) = [x(\omega_1) \cdots x(\omega_K)]$. We note that an $\epsilon$-isometry for all vectors of the form $X(\boldsymbol{\omega})\mathbf{g}$ such that $\|X(\boldsymbol{\omega})\mathbf{g}\| \ge \delta$ and $\|\mathbf{g}\| = 1$ is equivalent to (14). We discretize the frequencies $[0, 2\pi]$ uniformly into $R$ points ($R$ is specified later) and obtain the set $F$. We first prove a $2\epsilon_0$ isometry for all vectors in the span of $X(\mathbf{q})$ for all frequency tuplets $\mathbf{q} \in F^K$ (i.e., $\mathbf{q} = [q_1 \cdots q_K]^T$ with $q_l \in F$). We then extend this to a $16\epsilon_0$ isometry for vectors $X(\boldsymbol{\omega})\mathbf{g}$ such that $\|X(\boldsymbol{\omega})\mathbf{g}\| \ge \delta$ and $\|\mathbf{g}\| = 1$ by: (a) approximating them to nearby points in the span of $X(\mathbf{q})$, (b) choosing $R = O(N^{1.5}K^{0.5}\delta^{-1}\epsilon_0^{-1})$ so that the approximation is good.

Sampling: For any tuplet of sampled frequencies $\mathbf{q} \in F^K$, if $A$ preserves the norm of $(6\epsilon_0^{-1})^{2K}$ well-chosen samples in the span of $X(\mathbf{q})$ up to $\epsilon_0$, it can be shown that $A$ will preserve the norms of all vectors in the span of $X(\mathbf{q})$ up to $2\epsilon_0$ [4]. Since there are $R^K$ sampled frequency tuplets $\mathbf{q} \in F^K$, by demanding that $A$ preserves the norm of $R^K(6\epsilon_0^{-1})^{2K}$ samples, we can provide a $2\epsilon_0$ isometry for the span of $X(\mathbf{q})$ $\forall\, \mathbf{q} \in F^K$.

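Isometry for mixtures of arbitrary frequencies: We now extend this to a $16\epsilon_0$ isometry result for vectors of the form $X(\boldsymbol{\omega})\mathbf{g}$ such that $\|X(\boldsymbol{\omega})\mathbf{g}\| \ge \delta$ and $\|\mathbf{g}\| = 1$ by choosing $R$ appropriately.

Let $\mathbf{q}$ be a tuplet in $F^K$ that is close to $\boldsymbol{\omega}$, satisfying $\max_l |q_l - \omega_l| \le \pi/R$. We let $e_l = x(\omega_l) - x(q_l)$ and bound the absolute value of each term of $e_l$ using the mean value theorem to get $\|e_l\| \le \pi N/(\sqrt{2}R)$. We use this to calculate a bound on the difference between a vector $X(\boldsymbol{\omega})\mathbf{g}$ and its approximated version $X(\mathbf{q})\mathbf{g}$. Using the definition of $e_l$, we obtain

$$X(\boldsymbol{\omega})\mathbf{g} = X(\mathbf{q})\mathbf{g} + \sum_{l=1}^{K} g_l e_l. \qquad (17)$$

Using the triangle inequality and the fact that $\sum_l |g_l| \le \sqrt{K}$ (since $\|\mathbf{g}\| = 1$), we have

$$\frac{\|X(\mathbf{q})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm \frac{\pi\sqrt{K}\,N R^{-1}}{\|X(\boldsymbol{\omega})\mathbf{g}\|}, \qquad (18)$$

where $x \le y \pm z$ denotes $y - z \le x \le y + z$.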

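Next, we bound the difference between the vectors $AX(\boldsymbol{\omega})\mathbf{g}$ and $AX(\mathbf{q})\mathbf{g}$. We see that $\|A\|_F = \sqrt{M}$ and, therefore, have $\|Ae_k\| \le \sqrt{M}\,\|e_k\|$. Furthermore, since $A$ preserves the norms of all vectors of the form $X(\mathbf{q})\mathbf{g}$, where $\mathbf{q} \in F^K$, up to an isometry constant $2\epsilon_0$ (and scale factor of $\sqrt{M/N}$), we get

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\mathbf{q})\mathbf{g}\|} \le 1 \pm 2\epsilon_0 \pm \frac{\pi N\sqrt{NK}\,R^{-1}}{\|X(\mathbf{q})\mathbf{g}\|}. \qquad (19)$$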

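Before we proceed to give the isometry result, we need to characterize how small $\|X(\mathbf{q})\mathbf{g}\|$ can be in (19). Since $\|X(\boldsymbol{\omega})\mathbf{g}\| \ge \delta$, from (18) we have the following:

$$\frac{\|X(\mathbf{q})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm \pi\sqrt{K}\,N (R\delta)^{-1}.$$

Choosing $R = (\pi/2)\,N\sqrt{NK}\,\epsilon_0^{-1}\delta^{-1}$, we have that

$$\frac{\|X(\mathbf{q})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm 2\epsilon_0 \qquad (20)$$

and

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\mathbf{q})\mathbf{g}\|} \le 1 \pm 2\epsilon_0 \pm \frac{2\epsilon_0\delta}{\|X(\mathbf{q})\mathbf{g}\|}.$$

Using the lower bound $\|X(\mathbf{q})\mathbf{g}\| \ge (1 - 2\epsilon_0)\|X(\boldsymbol{\omega})\mathbf{g}\| \ge \delta(1 - 2\epsilon_0)$, we get

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\mathbf{q})\mathbf{g}\|} \le 1 \pm 8\epsilon_0.$$

Substituting the bounds for $\|X(\mathbf{q})\mathbf{g}\|$ in terms of $\|X(\boldsymbol{\omega})\mathbf{g}\|$ from (20), we have that

$$\sqrt{\frac{N}{M}}\,\frac{\|AX(\boldsymbol{\omega})\mathbf{g}\|}{\|X(\boldsymbol{\omega})\mathbf{g}\|} \le 1 \pm 16\epsilon_0.$$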

Number of measurements: It only remains to specify the number of measurements $M$ required to preserve the norms of the $R^K(6\epsilon_0^{-1})^{2K}$ samples up to $\epsilon_0$. Using the value for $R$ just obtained, and setting $\epsilon = 16\epsilon_0$, we see that we must preserve the norms of $(18 \times 16^3\,\pi N^{1.5}K^{0.5}\epsilon^{-3}\delta^{-1})^K$ vectors up to $\epsilon/16$ w.h.p. We relate the probability of preserving these norms to the number of measurements $M$ by deriving a Chernoff bound on the deviations of $\|Av\|^2$ for an arbitrary $v \in \mathbb{C}^N$ (proof similar to [5]):



Pr 



TV IIAvl 



- 1 



> e 



~~k Me ' 



(21) 



We employ the union bound and (21) to compute the probability that the norm of at least one sample is not preserved. This probability becomes vanishingly small for $M = O\left(\epsilon^{-2} K \log\left(NK\epsilon^{-1}\delta^{-1}\right)\right)$ measurements, which concludes the proof.
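The concentration behaviour behind the Chernoff bound can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: it takes $A$ to have i.i.d. $\pm 1/\sqrt{N}$ entries (one ensemble consistent with the $\sqrt{M/N}$ scale factor used in the proof), and the function names `norm_ratio_samples` and `empirical_tail` are hypothetical. It estimates how often $(N/M)\|Av\|^2/\|v\|^2$ deviates from 1 by more than $\epsilon$; it does not reproduce the constants of (21).

```python
import numpy as np

def norm_ratio_samples(N=256, M=64, trials=500, seed=0):
    """Draw (N/M) * ||A v||^2 / ||v||^2 for a fixed complex v and random A."""
    rng = np.random.default_rng(seed)
    # Arbitrary fixed test vector in C^N.
    v = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    ratios = np.empty(trials)
    for t in range(trials):
        # Assumed ensemble: i.i.d. +/- 1/sqrt(N) entries, so E||A v||^2 = (M/N) ||v||^2.
        A = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(N)
        ratios[t] = (N / M) * np.linalg.norm(A @ v) ** 2 / np.linalg.norm(v) ** 2
    return ratios

def empirical_tail(ratios, eps):
    """Fraction of trials in which the normalised squared norm deviates from 1 by more than eps."""
    return float(np.mean(np.abs(ratios - 1.0) > eps))

ratios = norm_ratio_samples()
for eps in (0.1, 0.3, 0.5):
    print(f"eps = {eps}: empirical tail = {empirical_tail(ratios, eps):.3f}")
```

Increasing `M` in this experiment shrinks the empirical tail for a fixed `eps`, which is the qualitative behaviour the union-bound argument relies on.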

Appendix C 
Proof of Theorem 3

For the matrix $T(\omega)$, $K$ of the columns are of the form $\tau\, dx(\omega)/d\omega$, while the remaining $K$ are of the form $x(\omega)$. When $\tau\, dx(\omega)/d\omega$ is approximated by $\tau\, dx(q)/d\omega$, where $q$ is the frequency on a uniformly spaced frequency grid with $R$ points that is closest to $\omega$, the norm of the approximation error is upper bounded by $\pi N/(\sqrt{2}R)$. The upper bound on the norm of the error in approximating $x(\omega)$ by $x(q)$ used in Theorem 4 is also $\pi N/(\sqrt{2}R)$. Therefore, by following the proof of Theorem 4 with $K$ set to $2K$ (because the number of columns of $X(\omega)$ is only $K$), we obtain the proof of Theorem 3.
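The grid-approximation bound for $x(\omega)$ can be sanity-checked numerically. The following is a minimal sketch, assuming a unit-norm complex sinusoid sampled at $n = 0, \dots, N-1$ and a grid of $R$ equally spaced points on $[0, 2\pi)$; both conventions, and the helper names `sinusoid` and `worst_grid_error`, are assumptions made for illustration. It compares the largest observed $\|x(\omega) - x(q)\|$ against $\pi N/(\sqrt{2}R)$.

```python
import numpy as np

def sinusoid(omega, N):
    # Assumed convention: unit-norm complex sinusoid sampled at n = 0, ..., N-1.
    n = np.arange(N)
    return np.exp(1j * omega * n) / np.sqrt(N)

def worst_grid_error(N, R, trials=2000, seed=0):
    """Largest observed ||x(omega) - x(q)|| over random omega, q being the nearest grid point."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for omega in rng.uniform(0.0, 2.0 * np.pi, size=trials):
        # Nearest point on the uniform grid {2*pi*k/R}; the modulo handles wrap-around at 2*pi.
        k = int(np.round(omega * R / (2.0 * np.pi))) % R
        q = 2.0 * np.pi * k / R
        err = np.linalg.norm(sinusoid(omega, N) - sinusoid(q, N))
        worst = max(worst, err)
    bound = np.pi * N / (np.sqrt(2.0) * R)
    return worst, bound

worst, bound = worst_grid_error(N=64, R=1024)
print(f"worst observed error {worst:.4e}, bound {bound:.4e}, holds: {worst <= bound}")
```

Under these conventions the observed error stays below the bound because $|e^{j\theta} - 1| \le |\theta|$ and the nearest grid point is at most $\pi/R$ away; the derivative columns are not checked here, since their error depends on how $\tau$ is normalized.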

Appendix D 
Extending tangent plane isometry 

We first derive a bound on $\|e\|$. Applying the triangle inequality to $e$, we obtain $\|e\| \le |g_1|\,\|e_1\| + |g_2|\,\|e_2\|$. Since the quantity we wish to bound, $\|AX(\omega)g\|/\|X(\omega)g\|$, does not depend on $\|g\|$, we can, without loss of generality, restrict attention to $\|g\| = 1$. Thus, we have $|g_i| \le 1$. We use the mean value theorem to obtain bounds on $\|e_i\|$, $i = 1, 2$ (the mean value theorem relates $e_i$ to $d^2x(\omega_i')/d\omega^2$ for some $\omega_i' \in [\omega_1, \omega_2]$) and ultimately get $\|e\| \le N^2\Delta^2/(4\sqrt{2})$.

In order to obtain a lower bound for $\|v\|$, we rewrite $v$ as



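$$v = T(q) \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \dfrac{\Delta}{\sqrt{2}\,\tau} \end{bmatrix} \begin{bmatrix} \dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{2}} \\ -\dfrac{1}{\sqrt{2}} & \dfrac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} g_1 \\ g_2 \end{bmatrix}. \tag{22}$$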



We now recall that the minimum singular value of the product of two matrices is at least the product of their minimum singular values. The minimum singular value of $T(q)$ is $\sigma_{\text{tangent}} = \sqrt{1 - \eta}$, and the corresponding values for the other two matrices are $|\Delta|/(\sqrt{2}\,\tau)$ and 1 respectively. Thus, the minimum singular value of the product of the three matrices is at least $\sqrt{1 - \eta} \times |\Delta|/(\sqrt{2}\,\tau)$. Since $\|g\| = 1$, we immediately get the desired bound on $\|v\|$.
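The singular-value step can be checked on a concrete instance. The sketch below is an illustration only, under several assumptions that are not taken from the text: a unit-norm sinusoid sampled at $n = 0, \dots, N-1$, a tangent matrix whose second column is the derivative scaled by $\tau = 1/\|dx/d\omega\|$, frequencies placed at $q \mp \Delta/2$, and the hypothetical helper names `sigma_min` and `tangent_factors`. It builds a tall matrix, a diagonal scaling and an orthogonal matrix, and confirms that the smallest singular value of the product is no smaller than the product of the individual smallest singular values.

```python
import numpy as np

def sigma_min(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False).min()

def tangent_factors(N=64, q=1.0, delta=1e-3):
    n = np.arange(N)
    x = np.exp(1j * q * n) / np.sqrt(N)       # assumed unit-norm sinusoid at frequency q
    dx = 1j * n * x                           # derivative of x with respect to frequency
    tau = 1.0 / np.linalg.norm(dx)            # assumed normalisation of the derivative column
    T = np.column_stack([x, tau * dx])        # tangent-plane matrix [x, tau * dx/dw]
    D = np.diag([np.sqrt(2.0), delta / (np.sqrt(2.0) * tau)])
    U = np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2.0)   # orthogonal; sign convention assumed
    # Exact two-sinusoid matrix with frequencies q - delta/2 and q + delta/2.
    X = np.column_stack([np.exp(1j * (q - delta / 2) * n),
                         np.exp(1j * (q + delta / 2) * n)]) / np.sqrt(N)
    return T, D, U, X

T, D, U, X = tangent_factors()
lhs = sigma_min(T @ D @ U)
rhs = sigma_min(T) * sigma_min(D) * sigma_min(U)
print(f"sigma_min(T D U) = {lhs:.4e} >= {rhs:.4e}")
print(f"first-order approximation error ||X - T D U||_2 = {np.linalg.norm(X - T @ D @ U, 2):.4e}")
```

The last line also reports how far the first-order (tangent-plane) approximation is from the exact two-sinusoid matrix, which plays the role of the error term $e$ in the argument.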

Appendix E
Smallest singular value for well-separated frequencies

We wish to obtain a lower bound for the lowest singular value $\sigma_{\text{signal}}$ of the matrix $[x(\omega_1)\ x(\omega_2)]$ when the frequencies satisfy $|\omega_1 - \omega_2| \ge C_s/N^{1.5}$. First, we note that this is equivalent to upper-bounding $|H(\omega_1 - \omega_2)|$ since $\sigma^2_{\text{signal}} = 1 - |H(\omega_1 - \omega_2)|$. Since $|H(\omega)|$ is not necessarily monotonic (imagine that $h_n$ is the Hamming window; $|H(\omega)|$, being the Fourier transform of $|h_n|^2$, has sidelobes), it is not true in general that the maximum of $|H(\omega)|$ over $|\omega| \ge C_s/N^{1.5}$ occurs at $\omega = C_s/N^{1.5}$. However, we now make two observations that allow us to analyze the behavior of $|H(\omega)|$ only at the minimum separation $C_s/N^{1.5}$.
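The relation between the smallest singular value and $|H|$ is easy to verify numerically. The sketch below assumes windowed sinusoids $x(\omega)_n = h_n e^{j\omega n}$ with a Hamming window normalized so that $\sum_n |h_n|^2 = 1$, and takes $H$ to be the Fourier transform of $|h_n|^2$; the indexing $n = 0, \dots, N-1$ and the function names are assumptions made for illustration.

```python
import numpy as np

N = 64
n = np.arange(N)
h = np.hamming(N)
h = h / np.linalg.norm(h)                 # normalise so that sum |h_n|^2 = 1

def x(omega):
    """Windowed complex sinusoid (unit norm by construction)."""
    return h * np.exp(1j * omega * n)

def H(omega):
    """Fourier transform of |h_n|^2 evaluated at omega."""
    return np.sum(np.abs(h) ** 2 * np.exp(1j * omega * n))

def sigma_min_sq(w1, w2):
    """Squared smallest singular value of the two-column matrix [x(w1) x(w2)]."""
    X = np.column_stack([x(w1), x(w2)])
    return np.linalg.svd(X, compute_uv=False).min() ** 2

w1, w2 = 1.0, 1.05
print(sigma_min_sq(w1, w2), 1.0 - abs(H(w1 - w2)))
```

The two printed values agree to numerical precision because the Gram matrix of two unit-norm columns with inner product $\rho$ has eigenvalues $1 \pm |\rho|$, and here $|\rho| = |H(\omega_1 - \omega_2)|$.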

First, if there were no restrictions on the frequencies $(\omega_1, \omega_2)$, $|H(\omega_1 - \omega_2)|$ has a unique maximum when $\omega_1 = \omega_2$. Second, because the set of frequencies we are excluding, $|\omega_1 - \omega_2| < C_s/N^{1.5}$, is very small, the only values that $\omega_2$ cannot take lie in a tiny band around $\omega_1$. Therefore, the maximum of $|H(\omega_1 - \omega_2)|$ over $|\omega_1 - \omega_2| \ge C_s/N^{1.5}$ is guaranteed to occur when $\omega_2 = \omega_1 \pm C_s/N^{1.5}$ (imagine discarding a tiny window of frequencies around the peak of the main lobe for the Hamming window; the maximum across the remaining frequencies will occur at the edge of the discarded window).

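For small values of $|\omega_1 - \omega_2|$, we can expand $|H(\omega)|$ around $\omega = 0$ and use $\sigma^2_{\text{signal}} = 1 - |H(\omega_1 - \omega_2)|$ to get

$$\sigma^2_{\text{signal}} = \xi N^2 (\omega_1 - \omega_2)^2 \pm O\left(N^4 (\omega_1 - \omega_2)^4\right),$$

where $\xi = -\dfrac{N^{-2}}{2} \left.\dfrac{d^2 |H(\omega)|}{d\omega^2}\right|_{\omega = 0}$.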

For $|\omega_1 - \omega_2| = C_s/N^{1.5}$, the first term in $\sigma^2_{\text{signal}}$, given by $\xi C_s^2/N$, is finite and the second term decays faster (as $1/N^2$). Therefore, for large values of $N$, the second term is much smaller than the first and $\sigma^2_{\text{signal}}$ is bounded away from
zero. In particular, for large enough $N$ (how large it needs to be depends on $C_s$ and the behavior of $|H(\omega)|$ at $\omega = 0$), we have $\sigma^2_{\text{signal}} > 0.1\,\xi C_s^2/N$.
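The small-separation behaviour can be illustrated numerically. The sketch below is a rough check under stated assumptions: a Hamming window with $|h_n|^2$ normalized to sum to one, samples indexed $n = 0, \dots, N-1$, and a second-order coefficient computed as the variance of $n$ under the weights $|h_n|^2$ divided by $2N^2$ (which is what the second derivative of $|H|$ at zero reduces to for a symmetric window). The function names are hypothetical. It compares $1 - |H(\omega)|$ at the separation $C_s/N^{1.5}$ with the leading term, and checks that $|H|$ over the allowed separations peaks at the edge of the excluded band.

```python
import numpy as np

def window_weights(N):
    """Weights p_n = |h_n|^2 for a Hamming window, normalised to sum to one."""
    h = np.hamming(N)
    return h ** 2 / np.sum(h ** 2)

def H_mag(omegas, p):
    """|H(omega)| for an array of frequencies, H being the Fourier transform of p."""
    n = np.arange(len(p))
    return np.abs(np.exp(1j * np.outer(omegas, n)) @ p)

def leading_term_check(N=256, Cs=1.0):
    p = window_weights(N)
    n = np.arange(N)
    var = np.sum(p * n ** 2) - np.sum(p * n) ** 2
    xi = var / (2.0 * N ** 2)                  # assumed form of the second-order coefficient
    d = Cs / N ** 1.5                          # minimum allowed separation
    exact = 1.0 - H_mag(np.array([d]), p)[0]   # 1 - |H| at the minimum separation
    approx = xi * N ** 2 * d ** 2              # leading term, equal to xi * Cs^2 / N
    return exact, approx, d, p

exact, approx, d, p = leading_term_check()
grid = np.linspace(d, np.pi, 4000)             # allowed separations, starting at the band edge
edge_is_max = int(np.argmax(H_mag(grid, p))) == 0
print(f"1 - |H| = {exact:.6e}, leading term = {approx:.6e}, maximum at band edge: {edge_is_max}")
```

For this window the leading term matches $1 - |H|$ closely at the minimum separation, and the largest value of $|H|$ over the allowed separations sits at the band edge, in line with the two observations made above.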
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